Make the most of PhDs


The number of people with science doctorates is rapidly increasing, but there are not enough 
academic jobs for them all. Graduate programmes should be reformed to meet students’ needs. 


It is hard to argue against the idea that a workforce should be highly
educated. The media, politicians and universities all believe that a
scientific background will not only benefit individuals, but also drive
science, innovation and the economy. As a result, the number of people 
entering higher education in the sciences and engineering has been on 
the rise for decades. Between 1995 and 2012, the Organisation for Eco- 
nomic Co-operation and Development reported an overall increase in 
university graduation rates of 22 percentage points. In the same time 
frame, the PhD production rate has doubled, even though PhDs account 
for only a small percentage of higher-education graduations. 

Getting a science PhD can be a very fulfilling experience, in which
students spend a few years enjoying the rigour and freedom of aca- 
demic research. Many pursue a PhD because they love the science, to 
satisfy their curiosity about the world or to contribute to a growing 
body of knowledge. All hope to emerge with the skills to pursue career 
goals, within or outside academia. 

They have a good chance of doing so, too. Science PhD holders 
experience very low unemployment rates — just 2.1% of people with 
doctorates in science, engineering or health in the United States were 
unemployed in 2013, according to the National Science Foundation 
(NSF) Survey of Doctorate Recipients. The overall national unemployment
rate for people aged 25 or older was 6.3%.

But the chances of getting a faculty job in academia — the career 
dream of many — are slim. Of the employed doctorate holders in the 
NSF survey, just over 50% were working outside academia across a 
variety of sectors, including industry, federal government and non- 
profit organizations. Many young researchers feel that their graduate 
training does not adequately prepare them for these different careers. 
Nor do they feel that they are being properly informed of their future 
prospects or the realities of the training. Many principal investigators, 
universities, funding bodies and governments are keen to keep pushing 
the message that a science PhD is good, and that there are plenty of jobs 
in academia. Job markets are not fixed, and can change substantially 
between the start and end of a lengthy PhD.

The opportunity cost of a PhD can also be substantial for young 
people. Not all have the luxury of being able to spend several years 
on a PhD with low pay and no clear destination: they can’t afford
it, might miss out on other opportunities, or prefer to pursue deep 
training in another sphere that is more appropriate for their skills and 
chosen careers. 

If we accept that there are positives to having lots of PhD holders, 
then we need to work out how the system should change to support 
them all. As a News Feature on page 22 explores, various suggestions 
are bouncing around. One is to revamp the PhD so that it combines 
research with development of workplace skills. Several institutions, 
such as the University of California, San Francisco, run courses that 
offer graduate students training in management, communication and 
entrepreneurship. 




Students could also skip the PhD entirely. Many who are 
contemplating a doctorate but aren't sure of its value to their future could 
instead experience postgraduate research through a master’s degree. 

A more controversial idea — around since the 1970s — is to cut the 
total number of graduate students entering the system. This has met
with stiff resistance from faculty members, university funding bodies
and governments.

The biggest problem for early-career researchers seems to be a lack of
data on the career trajectories and opportunities available to them.
Although some information-gathering efforts exist, none is substantial
enough to provide the detail needed for students to make informed
decisions about their futures.

To create a happy, sustainable PhD popula- 
tion, collaborative efforts between students, academics, industry and 
government leaders are needed. A science PhD is no longer an appren- 
ticeship in science for academia, but an apprenticeship in scientific 
thinking that is beneficial for all walks of life. There are already some 
grass-roots campaigns in this direction, but they are not enough. The 
welfare and future of the economy and science rest on the shoulders 
of young, highly educated workers. Policymakers need to start putting 
the graduates’ needs first.


Root causes 


Research has a part to play in identifying the 
factors that breed terrorism. 


fighting terrorism, and to improving counterterrorism 
policies? A common focus is a narrow concept of radicaliza- 
tion that explores why individuals turn to extremism. 

Since the 11 September terrorist attacks in the United States in 
2001 — and the deadly bombings in Madrid in 2004 and London 
in 2005 — an entire industry of government-funded consultants and 
researchers has grown up around this idea. But many researchers find 
such emphasis problematic; they argue, for example, that it can distract 
from the need for a broader understanding of the roots of terrorism. 
They also fear that counterterrorism policies based on it may be inef- 
fective, and risk being counterproductive. 

Research to understand why and how people, such as the young 
people who carried out the attacks in Paris on 13 November, become 
radicalized is crucial, and as a News story on page 20 describes, such
research has provided some important insights. But there is no typi- 
cal profile of those who turn to violent extremism, and the causes are 
highly diverse. Radicalization has become a central plank in national 
counterterrorism policies, with efforts made to identify individuals 
and groups showing signs of radicalization or vulnerability, and to 
de-radicalize them. 

Under Britain’s ‘Prevent’ programme, the government earlier this 
year made it compulsory for staff in schools, universities, councils, 
prisons and other bodies to monitor or refer such individuals to the 
authorities. Yet it’s clear that only a tiny minority of the vast numbers 
of people flagged by counter-radicalization efforts risk turning to 
terrorism — and spotting which ones is extremely difficult. 

Some researchers argue that such policies are justified on the 
grounds that the immediate terrorist threat to democracies is so 
great that everything must be done in the short term to stop people 
from becoming radicalized and to spot violent extremists, while also 
addressing the broader causes and dynamics of terrorism. But other 
researchers question the effectiveness of such policies, and argue that 
the focus should be more on community policing and on reinforcing 
intelligence to identify the recruiters and ringleaders of terrorist net- 
works. Social profiling, they add, comes at a potentially high cost. It 
risks stigmatizing further Muslims and those of immigrant origin, and 
inadvertently legitimizing the anti-Islam, and often racist, rhetoric of 
extreme-right-wing parties. The resulting social division risks mak- 
ing matters worse and increasing the pool of potential terror recruits. 

As Nadia Fadil, who specializes in Islam in Europe at the University 
of Leuven in Belgium, points out, policies based on targeting
Muslim populations also risk harming existing community-based
prevention methods. In Belgium, youth workers, teachers and other 
officials who were previously considered trustworthy bridge-builders 
are increasingly distrusted in the communities in which they work
because they are now perceived as state spies.

Shortcomings of radicalization research itself were highlighted
in a 2013 review of the research literature by Alex Schmid, a director
of the Terrorism Research Initiative, an international consortium of
researchers and research centres. It concluded that the search for the
causes of radicalization of young people has produced “inconclusive
results”, and that counter-radicalization and de-radicalization
programmes lack rigorous evaluation.

Most worryingly, the review highlighted 
blind spots in radicalization research. Much, it concluded, is 
“one-sided”, in that it looks only at the radicalization of Islamist, 
non-state actors, and ignores the fact that radicalization of Western 
governments can also occur, combining in a vicious circle that can 
fuel strife and terrorism. 

That is an unpopular view, and is often considered by politicians and 
the media to be making excuses for terrorism. But it must be taken into 
account to develop more effective policies and to identify those that are 
ineffective or even harmful. 

Research can do its bit, by bringing an evidence-based, neutral 
and broader perspective that can enlighten counterterrorism, social, 
educational and other policies. That need is now greater than ever.


Take more risks 


Scientific innovation is being smothered by a 
culture of conformity. 


Suppose you are devising a technique to transfer proteins from a gel
to a plastic substrate for easier analysis. Useful, maybe — but will 
you gain kudos for it? A notable finding of last year’s survey of the 
100 most cited papers on the Web of Science (see Nature 514, 550; 2014) 
was how many of them reported such apparently mundane methodo- 
logical research (this protein-transfer method came in at number six). 

Not all prosaic work reaches such bibliometric heights, but that 
does not deny its value. Overcoming the hurdles of nanoparticle drug 
delivery, for example, requires the painstaking characterization of 
pathways and rates of breakdown and loss in the body — work that 
is probably unpublishable, let alone unglamorous. One can cite com- 
parable demands for detail to get just about any bright idea to work 
in practice — but it’s usually the initial idea, not the hard grind, that 
garners the praise. The incentives for such boring but essential col- 
lection of fine-grained data to solve a specific problem are vanishing 
in a publish-or-perish culture. 

Meanwhile, a recent analysis of discovery and innovation in biomedi- 
cine, using the molecules studied as value markers, finds that the choice 
of research problems is becoming more conservative and risk-averse 
(A. Rzhetsky et al. Proc. Natl Acad. Sci. USA 112, 14569-14574; 2015). 
One might quibble with the scope of the study, but its general conclu- 
sions — that current norms discourage risk and therefore slow down 
scientific advance, and that the problem is worsening — ring true. 

Attempts to hit the publishable ‘sweet spot’ by avoiding both the 
prosaic and the risky are likely to reduce the efficiency of scientific 
discovery. But a fashionably despairing cry of ‘Science is broken!’ is 
not the way forward. The wider virtue of Rzhetsky et al.’s study is that 
it floats the notion of tuning practices and institutions to accelerate 
the process of scientific discovery. The researchers conclude, for
example, that publication of experimental failures would assist this 
goal by avoiding wasteful repetition. Journals chasing impact factors 
might not welcome that, but they are no longer the sole repositories 
of scientific findings. Rzhetsky et al. also suggest some shifts in insti- 
tutional structures that might help promote riskier, but potentially 
more groundbreaking, research — for example, spreading both risk 
and credit among teams or organizations. 

The danger is that efforts to streamline discovery simply become 
codified into another set of guidelines and procedures, creating yet 
more hoops for grant applicants to jump through. 

A better first step would be to recognize the message that research 
on complex systems has emphasized: efficiencies are much more 
likely to come from the bottom up. The aim is to design systems with 
basic rules of engagement for participating agents that best enable an 
optimal state to emerge. Such principles typically confer adaptability, 
diversity and robustness. There could be a wider mix of grant sources 
and sizes, say, less rigid disciplinary boundaries, and wider acceptance 
that citation records are not the only measure of worth. 

But perhaps more than anything, the current narrowing of 
objectives, opportunities and strategies in science reflects an erosion of 
trust. Obsessive focus on ‘impact’ and regular scrutiny of bibliometric 
data betray a lack of trust that would have sunk many discoveries 
and discoverers of the past. Bibliometrics might sometimes be hard 
to avoid as a first-pass filter for appointments (see Nature 527, 279; 
2015), but a steady stream of publications is not the only, or even the 
best, measure of potential. 

Attempts to tackle these widely acknowledged problems are 
typically little more than a timid rearranging of deckchairs. Partly that’s 
because they are seen as someone else’s problem: the culprits are never 
the complainants, but the referees, grant agencies and tenure com- 
mittees who oppress them. Yet oddly enough, 
these obstructive folk are, almost without excep- 
tion, scientists too (or at least, they once were). 
Inefficiencies can exact a huge price. It is time 
to oil the gears.
Every year brings 528,000 new cases of cervical cancer and 266,000
deaths, linked to human papillomavirus (HPV). We have a highly
effective HPV vaccine, but suspicion stands in the way of its 
adoption in many countries. How can we dispel this mistrust? 

On 20 November, a report from the European Medicines Agency 
(EMA) confirmed the vaccine’s safety. The agency had been asked by 
Denmark to reinvestigate after symptoms of dizziness, fainting, aches 
and pains were reported in adolescent girls and suspicion fell on the 
vaccine. It is not the only country to report such events. 

The good news is that public concern about these reactions is being 
heard and has prompted further investigation. The EMA report is one 
of many to confirm the safety of the vaccine and conclude that there is 
no need to change vaccination policies. 

The not-so-good news is that not everyone believes them. 

Evidence suggests that the events were ‘psychogenic illnesses’,
psychological reactions that can
spread fast, especially when girls are vaccinated 
in groups at school and witness each other's reac- 
tions. A growing collection of YouTube clips is also 
fuelling anxieties. 

My research group studies situations in which 
public, provider or political trust in vaccines has 
been broken. We have heard many testimonies of 
the anxiety that politicians and decision-makers 
face when pressured about suspected vaccine 
reactions while also hearing that scientific evi- 
dence exonerates the vaccines. We have learned 
the importance of monitoring public sentiment, 
responding promptly to concerns and engaging 
and listening to the public early on when vaccines 
are being introduced. 

In some nations, politicians side with the science. In others, they 
bend to minority opinions. Japan reacted ambiguously to reports of 
HPV vaccine side effects: it withdrew ‘proactive recommendation’ of
the vaccine while it investigated, but continued to provide the vaccine 
for those who demanded it. The investigations found no clear causal link 
to the vaccine, but the recommendation remains suspended. 

In another case, in 2010, we investigated the suspension of HPV 
vaccine demonstration projects in two Indian states. Vaccination accept- 
ance was high in the projects; the pressure had come from an activist 
women’s group far away in New Delhi. When the group’s demands for 
public dialogue about the safety, efficacy and cost-effectiveness of the 
initiative were not answered, it found, and widely reported, seven deaths 
among girls who had participated. 

These deaths were judged unrelated to the vaccine, but the projects
never resumed. Nearly five years later, millions of women are missing
out on the chance to prevent cervical cancer. One-quar-
The world must accept that
the HPV vaccine is safe


But the science alone will not be enough to build public and political 
confidence, says Heidi Larson. 


Some governments stand by the science even when faced with public 
panic. Last year, 600 girls in a Colombian municipality reported symp- 
toms after HPV vaccination. Faced with local anxieties and some anger, 
the Colombian government expressed empathy, and the vaccination 
programme continues. England reached 87% full-dose coverage in 
2014, having averted a potential public-confidence crisis in 2009, when 
a 14-year-old girl died after being vaccinated. Health officials expressed 
concern, promptly investigated the girl’s death and found it unrelated 
to the vaccine. 

Psychogenic reactions are not unique to HPV vaccination. During 
the 2009 H1N1 influenza pandemic, there were 23 episodes of mass 
psychogenic illness in Taiwan’s school flu-vaccination programme. In 
Iran, people panicked after 10 girls in a class of 26 experienced psycho- 
genic reactions after tetanus shots. 

I learned about the Iran situation while working
with UNICEF just over a decade ago, when I was 
asked to help plan a nationwide measles cam- 
paign — and, specifically, to design ways to pre- 
empt the type of panic provoked by the tetanus 
vaccine reactions. The measles campaign was a 
success, but it took considerable advance work 
that included gathering local input into communi- 
cation materials and outreach early in their prepa- 
ration; engaging young people (the campaign was 
targeting everyone under 25 years old); and work- 
ing with schools, local leaders and the media. 

The HPV vaccine carries unique challenges. 
Because the first thing it prevents is sexual trans- 
mission of HPV, use of the vaccine evokes moral 
judgements around sexual behaviour. 

The United States is struggling to get HPV 
vaccination coverage above 40%. Some parents are anxious that the vac- 
cine will make their daughters more promiscuous, even though multiple 
studies have found no such effect. Other reports cite ‘embarrassment’ in
some cultures about accepting the vaccine. 

The HPV vaccine touches nerves, and acceptance needs strategies 
that vary between cultural and political settings. Despite the challenges, 
more than 80 million girls and women around the world have received 
the vaccination. 

We should not underestimate the potential for progress to be 
disrupted by the mass spread of vaccine reactions and concerns, the 
amplification that can follow through social media and the vulnerabil- 
ity of political processes, which sometimes find themselves paralysed 
between public and scientific opinion.


Heidi Larson heads the Vaccine Confidence Project at the London 
School of Hygiene & Tropical Medicine, where she is a senior lecturer 
in the Department of Infectious Disease Epidemiology. 

e-mail: heidi.larson@lshtm.ac.uk 
Supernova glow
shows stellar twin 


The force of an exploding star 
may have ripped material off 
an orbiting companion star, 
leaving behind a signature 
glow. 

Astronomers first spotted 
the massive explosion of 
supernova iPTF 13ehe in 
2013. Two years later, they 
noticed an afterglow coming 
from clouds of hydrogen 
nearby. Takashi Moriya at the 
University of Bonn in Germany 
and his colleagues argue that 
this unexpected light came 
from material that was torn 
off another star during the 
violent original outburst. The 
energy from the blast could 
have blown away part of a
tightly orbiting companion
star, stripping off a mass of
hydrogen that could weigh 
nearly as much as the Sun. 

Careful scrutiny of hydrogen 
emissions from other 
especially bright supernovae 
could determine whether 
this radiance stems from 
companion stars or matter 
that is already present in the 
surrounding interstellar space. 
Astron. Astrophys. 584, L5 (2015) 


Alzheimer’s role of 
breast-cancer gene 


The DNA-repair protein 
BRCA1 is known to increase 
the risk of breast and ovarian 
cancer when it is mutated. 
But the normal protein might 
also have a central role in 
Alzheimer’s disease. 

Elsa Suberbielle and Lennart 
Mucke at the Gladstone 
Institute of Neurological 
Disease in San Francisco, 
California, and their colleagues 
lowered BRCA1 protein levels 
in mouse brains by blocking 
the BRCA1 gene using a small 
piece of RNA. They found
that some neurons shrank in
size and that the animals had
a reduced ability to learn and
remember their way around
a maze. The researchers also
showed that BRCA1 levels
were depleted in the post-
mortem brains of people
with Alzheimer’s. By looking
at this process in cultured
neurons and in mice, the
authors suggest that BRCA1
is degraded when amyloid-β
proteins accumulate in the
brain in Alzheimer’s disease.
Nature Commun. http://doi.org/9kk (2015)


Ecological toll of African infrastructure


Huge development projects such as roads and
railways that are planned or under construction
in Africa threaten swathes of its ecosystems.
William Laurance and his team at James
Cook University in Cairns, Australia,
mapped 33 ‘development corridors’ that
are being upgraded or planned, plus their
human populations and surrounding lands.
They found that these corridors would
stretch 53,000 kilometres and cut through
408 protected areas, 29 of which would be cut
by two or more corridors.

Corridors are often justified on the basis of
their benefits to agricultural production, but the
team found just five that would have both low
environmental impact and large agricultural
benefit. Six would degrade areas with high
conservation value and bring low agricultural
benefits, and the rest would bring only
“marginal” returns. Many of the developments
would cause serious and irreversible damage.
Curr. Biol. http://doi.org/9kg (2015)


Earth’s magma
in a spin


The rapid spin of the early
Earth could have influenced
the way that the planet
solidified.

Some 4.5 billion years ago,
Earth was extremely hot,
covered by a molten magma
ocean, and completed a
full rotation in a few hours.
Christian Maas and Ulrich
Hansen of the University of
Münster in Germany calculate
that the fast rotation could
have influenced how crystals
settled from the magma
ocean and shaped Earth’s
interior. They used a three-
dimensional model of the
formation of silicate crystals
in magma, and found that
a fast rotation rate created a
crystal layer that settled deeper
beneath the poles than under
the equator.

This could have played a key
part in how Earth’s mantle layer
eventually solidified out of the
magma ocean, say the authors.
J. Geophys. Res. Solid Earth
http://dx.doi.org/10.1002/2015JB0121053 (2015)
Genome shows
gecko evolution 


The first genome of a gecko 
species hints at the basis of 
its ability to regrow tails and 
climb walls. 

More than 1,400 species of 
gecko inhabit temperate areas 
across the world. A team led 
by Huanming Yang at BGI in 
Shenzhen and Xiaosong Gu at 
Nantong University, both in 
China, sequenced the genome 
of Schlegel’s Japanese gecko 
(Gekko japonicus; pictured) 
and identified more than 
22,000 genes. Comparisons 
with other reptile and 
vertebrate genomes show that 
geckos diverged from other 
lizards around 200 million 
years ago, after the split of two 
supercontinents. 

The gecko genome harbours 
dozens of copies of β-keratin
genes — expressed in hair-like 
growths called setae that help 
the animal to cling to vertical 
surfaces. Expression of two 
genes that make the hormone 
prostaglandin increased in 
geckos after their tails had been 
amputated, suggesting a role for 
this hormone in regeneration. 
Nature Commun. 6, 10033 (2015) 
Ozone destruction
in a future climate 


The potency of one of the 
major ozone-destroying gases 
could double because of future 
climate change. 

Nitrous oxide (N₂O) leads
to ozone destruction through 
various chemical reactions 
in the stratosphere, and is 
the main ozone-destroying 
gas released by human 
activity. Laura Revell at the 
Swiss Federal Institute of 
Technology in Zurich and her 
colleagues analysed the ozone- 
depletion potential of this gas 
using different scenarios of 
future climate change. The 


models showed that ozone
destruction involving N₂O
is made less efficient by the
higher concentrations of 
carbon dioxide and methane 
that are expected in the 
atmosphere by 2100. However, 
the team found that other 
changes in atmospheric 
chemistry, temperature and air 
circulation by 2100 could still 
increase the ozone depletion 
potential of N₂O by as much as
two-fold relative to 2000. 
Geophys. Res. Lett. 
http://doi.org/9h2 (2015) 


Africa’s herbivores 
mapped 


Researchers have constructed 
a map of Sub-Saharan Africa 
showing the types of plant- 
eating animals that grazed it 
some 1,000 years ago. 

Gareth Hempson, of the 
University of Cape Town 
in South Africa, and his 
colleagues used factors such 
as species distribution, rainfall 
and vegetation patterns to 
model the likely biomass of 
92 large herbivores across 
Sub-Saharan Africa around 
1,000 years ago. They 
divided the region into areas 
each measuring around 
12,000 square kilometres, and 
grouped areas that had similar 
biomass and animal types into 
four herbivore regimes. They 
named these ‘herbivomes’ 
after the forest duiker, the 
arid gazelle and the bulk 
feeder, with the fourth regime 
containing a high variety and 
abundance of larger species. 
The analysis should assist 
research on the loss of large 
plant-eaters and improve 
understanding of African 
ecology, the team says. 
Science 350, 1056-1061 (2015) 


Pollination is more 
than bees 


Other creatures visit more 
flowers than bees do, and may 
be almost as important in 
pollinating crops. 

Romina Rader at the 
University of New England in
Armidale, Australia, and her
colleagues analysed data from
39 field studies of pollination
by honey bees, other bees and
other insects, including flies,
beetles, moths and ants. They
found that other insects carried
out 25-50% of all visits to
crop flowers. Although these
‘non-bees’ were less effective at
pollinating on each visit, their
increased visits made them
roughly as effective as bees.
Crops such as coffee and
grapefruit were almost
exclusively pollinated by bees,
whereas crops such as custard
apples and mangoes relied
almost totally on other insects.
Non-bees were also found to
be less affected by changes to
natural habitats, so the authors
suggest that these insects
might provide a more robust
pollination service than bees do.
Proc. Natl Acad. Sci. USA
http://dx.doi.org/10.1073/pnas.1517092112 (2015)


Popular topics
on social media


Credit and co-authors cause chatter


Questions of paper authorship have been plaguing scientists on
social media: who should come first? And who deserves to be
listed at all? When it comes to papers with numerous authors,
the publishing process can get messy. For instance, when
Dorothy Bishop, a psychologist at the University of Oxford,
UK, found herself trying to review a paper blemished with
mistakes, she tweeted: “When a manuscript with 20+ authors
has grammatical errors, typos and/or no page numbers, you
wonder how many authors actually read it.” Others took a less
dark view. Deirdre Toher, a statistician at the University of the
West of England in Bristol, UK, tweeted that the logistics of
implementing changes from multiple researchers may have led
to the mistakes, adding “with that many authors it also means
that people assume that the basics are ‘someone else’s’
responsibility.”

NATURE.COM
For more on
popular papers:
go.nature.com/noociy


Pigeon leaders
fly faster


Birds that lead a social
group learn faster than their
followers, although the leaders
might not start out as the best
decision-makers.

Benjamin Pettit and
Dora Biro at the University
of Oxford, UK, and their 
colleagues tracked the 
behaviour of 40 homing 
pigeons (pictured) as the 
birds navigated various routes, 
both individually and as a 
flock. They found that birds 
that later assumed leadership 
of flocks had been the fastest 
fliers on previous solo flights, 
but had not necessarily 
navigated the shortest and 
most energy-efficient routes. 
On later solo flights, leaders 
learned to navigate along 
direct routes more quickly 
than followers did. The team 
suggests that, among pigeons 
at least, leadership is based 
on pre-existing individual 
differences rather than on 
social preferences or optimal 
group decision-making. 

Curr. Biol. http://doi.org/9kb 
(2015) 
Anthrax vaccine


An anthrax vaccine has
become the first to be
approved by the US Food
and Drug Administration
(FDA) under the ‘Animal Rule’,
which allows approval on the
basis of animal tests when
studies in humans are not
ethical or possible. The FDA
announced on 23 November
that the vaccine, called
BioThrax, can be used after
exposure to Bacillus anthracis,
the bacterium that causes
anthrax. BioThrax was initially
approved in 1970 to prevent
anthrax before exposure to
the bacterium. The vaccine is
made by Emergent BioDefense
Operations Lansing in
Michigan.


Retraction data 

A searchable database
should soon allow systematic
identification of retracted
publications. Posts and
article identifiers from the
blog Retraction Watch will
be incorporated into a web
application maintained by
the Center for Open Science
in Charlottesville, Virginia,
that already tracks research
activities such as posting
preprints or depositing data
sets. The resource will initially
have about 5,000 entries,
and was announced by both
organizations on 24 November.


LHC heavy metal 
After spending five months 
colliding protons following a 
major upgrade this year, the 
Large Hadron Collider (LHC) 
near Geneva, Switzerland, 
began a one-month run of 
experiments with heavy
ions on 25 November.

All main detectors at the 
accelerator — including 
ALICE, which was designed 
for this purpose — are now 
studying the state of matter 
known as quark-gluon
plasma, which can arise
when two nuclei of lead-208
collide. In these collisions,
the nuclei carry a record-
breaking energy of more than
1 petaelectronvolt.


Blue Origin gets to space and back


Commercial spaceflight company Blue
Origin (the brainchild of Jeff Bezos, head
of online retail giant Amazon) completed
a test of its reusable rocket on 23 November.
The autonomous vehicle was successfully
landed after it propelled a capsule to a height
of more than 100 kilometres, which is classed
as being in space. The flight comes just seven
months after one of the company’s rockets was
destroyed during a similar test. Blue Origin has
not yet completed a crewed flight; the capsule is
designed to carry up to six passengers into space.


RESEARCH
Emissions stall


Humanity’s greenhouse-gas
output increased by just 0.5%
in 2014, despite significant
global economic growth,
according to figures released
on 25 November. Carbon
emissions rose by 3-4%
per year in the first decade
of the twenty-first century,
but that growth has slowed
dramatically over the past
3 years, report the Netherlands
Environmental Assessment
Agency and the European
Commission's Joint Research
Centre. The biggest factor is
China, where slower economic
growth and a shift towards
cleaner energy sources
and less energy-intensive
manufacturing have reduced
the energy intensity of the
economy. See go.nature.com/kphlae
for more.


Deforestation rises


The rate of legal deforestation
in the Amazon rainforest
has risen over the past year,
Brazilian environment minister
Izabella Teixeira announced
on 26 November. Satellite
images show that 5,831 square
kilometres of forest were lost
to activities such as livestock
farming and agriculture in the
year up to July 2015, a 16%
increase on the previous year.
The increases were largest in
the states of Rondônia, Mato
Grosso and Amazonas. Of
these, Mato Grosso had the
biggest area of forest loss,
at 1,508 square kilometres.
Efforts by the Brazilian federal
government have generally
been bringing down rates of
deforestation, and the current
rate is around one-fifth of that
in 2004.


FUNDING
Energy partnership 


A group of 28 investors from 
10 countries has launched 

a multibillion-dollar clean- 
energy research partnership. 
The Breakthrough Energy 
Coalition, spearheaded 

by Microsoft founder Bill 
Gates, and including Virgin 



founder Richard Branson and 
Amazon boss Jeff Bezos, was 
announced on 30 November, 
on the opening day of the 
international climate-change 
negotiations in Paris. The 
private partnership aims to 
support early-stage research 
into low-carbon technologies 
for future energy supply. It will 
complement energy-research 
efforts announced by US 
President Barack Obama and 
French President François
Hollande on the same day,
dubbed ‘Mission Innovation’.
See go.nature.com/wzigmx for 
more. 


POLICY 


Carbon plan canned 
On 25 November, the UK 
government scrapped a 
£1-billion (US$1.5-billion) 
competition to build a 
demonstration carbon capture 
and storage plant. Funding 

for the project — intended 

to demonstrate that carbon 
dioxide can be filtered out of 
power-plant exhaust gases on 

a commercial scale — has been 
on the table since 2012, but was 
removed from government 
plans in the latest five-year 
spending review. 


Rhino-horn ban 

A South African court has 
lifted a ban on the domestic 
trade in rhino horn (pictured) 
after two game farmers 
claimed that it infringed their 


right to trade in a renewable 
substance. On 26 November, 
the judge ruled that the ban, 
introduced in 2009, had not 
undergone proper public 
consultation. He added that 
since 2008 the number of 
South African rhinos poached 
for their horns has increased 
from less than 100 per year to 
around 1,200. Conservation 
group Save the Rhino asked 
how a national ban could fuel 
poaching, which mainly serves 
overseas markets, given that 
the international trade is illegal. 
The South African government 
is to appeal the ruling; the 

law will stay in place until the 
appeal has been heard. 


Open-access policy 
The Netherlands Organisation 
for Scientific Research (NWO) 
is tightening its open-access 
policy to demand that research 
results become universally 
available as soon as authors 
publish them. NWO-funded 
researchers were previously 


obliged either to publish in 

an open-access journal or to 
submit a version of their work 
to a public database ‘as soon
as possible’ after publishing 

in a pay-to-read journal. 
From 1 December, new grant 
conditions require Dutch 
researchers to make work 
immediately accessible. To 
avoid conflicting with journals 
that enforce embargo periods, 
such as Nature, researchers 
can submit pre-peer-review 
versions to a database. 


Animal clones 


A huge animal-cloning centre 
in Tianjin, China, will open 
early in 2016. Launched 

with 200 million yuan 
(US$31.3 million) from Sinica, 
a subsidiary of BoyaLife in 
Wuxi, the Tianjin International 
Joint Academy of Biomedicine, 
Peking University in Beijing 
and Sooam Biotech in Seoul, 
the centre will clone cattle, dogs 
and racehorses. BoyaLife says 
that the aim is to produce one 
million cloned cow embryos 
annually to help Chinese 
farmers to meet demand for 
beef. 


Italian expo

The Italian government enacted a decree on 25 November that allocates €80 million (US$85 million) to launch a major research centre to focus on big-data exploitation in health and nutrition, as well as nanotechnologies. Called Human Technopole, the centre will take over part of the site used for the 2015 international exhibition called Milan Expo. It will continue the theme of the exhibition — ‘feeding the planet, energy for life’. Human Technopole will be led by the Genoa-based Italian Institute of Technology and will eventually employ more than 1,000 researchers.


TREND WATCH

European politicians often allow more fish to be taken from the seas than is recommended by scientists. Yet this excess varies by country, according to a study (G. Carpenter et al. Mar. Policy 64, 9-15; 2016). In 2001, the total catch permitted in the European Union averaged 33% more than that advised by the International Council for the Exploration of the Sea. In 2015, this fell to 7% above the advised level. EU politicians negotiate catch limits in secret, but more transparency is needed, say the authors.


OVERFISHING IN EUROPE

On average, Denmark and the United Kingdom received the highest catch allocations in excess of scientific advice between 2001 and 2015.

[Chart; values not recoverable. Countries shown: Denmark, United Kingdom, France, Sweden, The Netherlands, Spain, Ireland, Germany, Poland, Finland, Latvia, Estonia, Portugal.]


SEVEN DAYS | THIS WEEK


3-4 DECEMBER
The first International Workshop on Metamaterials-by-design takes place in Paris.
go.nature.com/hkzg9s


8-9 DECEMBER
The Royal Society of Medicine and the Nutrition Society in London jointly host a meeting that will look at the role of sleep in obesity and nutrition.
go.nature.com/xigwne


PEOPLE 
Maurice Strong 


Maurice Strong, the founding 
head of the United Nations 
Environment Programme 
(UNEP) and a leading figure 
in climate-change politics, 

has died aged 86. He was a 
major figure in organizing the 
1992 Rio Earth Summit and 
creating the UN Framework 
Convention on Climate 
Change. Strong is regarded 

as one of the most important 
people in the history of 

the environmental and 
sustainability movements. In a
statement released by UNEP on
28 November, Achim Steiner,
the current head of the agency,
called him a visionary and a
pioneer of global sustainable 
development. 
The Italian Senate has approved a budget amendment that would award €3 million to a stem-cell trial.


Italian scientists 
slam trial selection 


Senate assigns a stem-cell trial €3 million — but 
researchers call instead for an open competition.


BY ALISON ABBOTT 


Italian politicians have kindled the wrath of some biomedical scientists by hand-picking a stem-cell clinical trial for funding.

The €3-million (US$3.2-million) pot for the trial should be allocated through an open competition based on scientific merit rather than in an amendment to the country’s 2016 budget bill, say the researchers. They are appealing to Italy’s Parliament to change the amendment — which the Senate approved on 20 November — before it passes into law.

To many Italian scientists, the idea that 
certain projects receive funding at the whim of 
politicians feels depressingly familiar. In 2013, 
a government decree earmarked this money 
for a stem-cell clinical trial run by a specific 
clinic. That earlier allocation was to the con- 
troversial Stamina Foundation in Brescia, and 
was later abandoned after a fraught, year-long 
campaign by scientists who convinced the 
health ministry that the trial was based on bad 
science and illegal practices. In the objections 


to the amendment to the 2016 budget bill, 
there is no suggestion of scientific wrongdo- 
ing, but researchers take issue with the way that 
the trial selection was made. 

“Italian politicians can take a liking to a
project and then finance it directly,” says
Marino Zerial, a director at the Max Planck 
Institute of Molecular Cell Biology and 
Genetics in Dresden, Germany. He is one of 
29 researchers who signed a letter published 
in the newspaper La Stampa on 26 November 
calling for the budget-law amendment to be 
changed. “Where do you see this in any other 
country?” 

After the Stamina trial was abandoned, 
many Italian scientists assumed that politicians 
would never again earmark specific research 
projects for public funding. The Constitutional 
Court, reflecting on the Stamina debacle, 
stated that the selection of clinical trials to 
receive public funding should not be at the 
“pure political discretion of the legislator”. 

The amendment to the 2016 budget law 
does not name a specific trial: it states that the 
money should be given for a “phase II clinical 
trial based on the transplantation of human 
neural stem cells in patients affected with 
Amyotrophic Lateral Sclerosis” (ALS), a deadly 
neurodegenerative condition also known as 
motor neuron disease. But of the 11 stem-cell 
protocols that have been approved for early- 
phase clinical trials in Italy, only one meets 
these criteria — so the outcome amounts to a 
selection in practice, says Giuseppe Remuzzi, 
director of the Mario Negri Institute for Phar- 
macological Research in Bergamo, one of the 
letter’s signatories. 

The protocol comes from the lab of stem-cell 
researcher Angelo Vescovi, scientific director 
of the Casa Sollievo della Sofferenza research 
hospital in the southern Italian province of 
Foggia. It involves transplanting neurons 
derived from the brains of miscarried fetuses 
into the spinal cords of people with ALS. 

Senator Giorgio Santini, who proposed the 
amendment, told Nature that the way it had 
been selected and added to the bill was pro- 
cedurally correct and that, if signed into law, 
the amendment would help sick people. He 
added that, in principle, any scientist who met 
the criteria could apply for the money, but that 
Vescovi’s trial is the only one that is ready to be
implemented. 

Vescovi declined to comment on the fact 
that his work was proposed for funding, 
but noted that Santini was present at a
public meeting in Rome on 29 September, 
where Vescovi presented the final results of 
his phase I ALS trial. 

Unlike the criticisms levelled against the 
Stamina trial, the scientists who wrote the 
letter complaining about the amendment 
told Nature that they are not commenting 
on the quality of Vescovi’s research. They 
point out that a phase I trial in six patients 
that assessed the safety of the therapy was 
carried out with appropriate oversight of the 
health authorities (L. Mazzini et al. J. Transl.
Med. 13, 17; 2015). 

Their concerns lie with how the trial was selected, which is particularly painful because Italy’s funds for research are among the lowest in Europe, and the 2016 budget foresees no increase for public institutions. “Scarce resources should be allocated to the most absolute transparency rules”, they write. Remuzzi says that the criteria were so narrow that scientists like himself could not apply. “We ourselves have three stem-cell trials related to organ transplantation,” he says. “We should have had the opportunity to apply.”

Patient groups have also objected to the 
way the funding was allocated, saying that 
therapies for other illnesses should have 
been allowed to compete. The Italian Mul- 
tiple Sclerosis Foundation in Genoa has 
sent politicians evidence of other ready- 
to-go Italian stem-cell trials for various 
neurodegenerative diseases. And the Ital- 
ian Association for Huntington’s Chorea 
in Milan complained to the Chamber of 
Deputies — Italy’s second parliamentary 
house, which will vote on the budget bill 
next — that the amendment appeared 
without revealing the scientific criteria 
used to select one clinical trial from among 
other candidates. 

Scientists hope that their outcry will 
encourage the Chamber of Deputies to 
drop or modify the amendment in the next 
week or so. Time is tight: the 2016 budget 
law is linked to a confidence vote in the gov- 
ernment, which means that Parliament is 
under pressure to approve it before the end 
of the year.
Crews lower huge drills down from specially equipped ships to penetrate the sea floor.


Drill ship targets 
Earth’s mantle 


Indian Ocean expedition resumes quest to bore right 


through the planet’s crust. 


BY ALEXANDRA WITZE 


Jules Verne would have dug this plan: drill into the sea floor, through kilometres of the planet’s rocky crust to penetrate the denser underlying mantle. It is one of geology’s classic quests, conceived almost 60 years ago, at the peak of the plate-tectonics revolution. Since then, many have attempted it and failed. But an expedition starting this month is taking up the challenge once again.




In early December, the drill ship JOIDES 
Resolution will depart Colombo, Sri Lanka, 
and head for a spot in the southwestern 
Indian Ocean known as Atlantis Bank. 
There, it will lower a drill bit and try to screw 
it through 1.5 kilometres of rock, collecting a 
core sample as it goes. If all goes well, future 
expeditions — not yet scheduled or funded 
— will return and finalize the push into the 
mantle (see ‘Deep understanding’). 

Normally, the crust-mantle boundary is 
thought to be marked by a feature known as
the Mohorovičić discontinuity, or ‘Moho’, at
which seismic waves change velocity. But at 
Atlantis Bank, the mantle is thought to bubble 
up as far as 2.5 kilometres above the Moho, 
making it easier to reach. 

Reaching these deep-Earth frontiers “is one
of the great scientific endeavours of the century”,
says Henry Dick, a geophysicist at the
Woods Hole Oceanographic Institution in 
Massachusetts and co-leader of the expedition. 

Beneath continents, the Moho lies 30-60 
kilometres down. But beneath oceans it is 
close enough to be reached with ship-borne 
drilling equipment. In the drilling campaign 
— dubbed the Slow Spreading Ridge Moho, 
or ‘SloMo’ Project — Dick hopes to reach the 
crust-mantle transition at Atlantis Bank, then 
one day return with a state-of-the-art Japanese 
vessel to reach the Moho itself at a depth of 
5 kilometres or more. Along the way, scientists 
aim to answer profound questions about the 
planet, such as how molten rock rises from the 
interior and cools to form fresh ocean crust, a 
surface that blankets three-fifths of Earth. 


LONG-HELD DREAM 

A hole that deep “would be the window into 
things we have never seen before”, says Benoît
Ildefonse, a geologist at the University of 
Montpellier in France. 

Scientists first tried to reach the Moho in 
the middle of the twentieth century. In the 
1960s, US scientists led ‘Project Mohole’, 
which drilled into the sea floor off Guadalupe 
Island, Mexico. The project reached a depth 
of just 183 metres before costs ballooned and 
Congress killed it. Still, Project Mohole gave 
birth to a series of scientific ocean-drilling 


programmes that have extracted cores from 
hundreds of locations around the world. 
These have revolutionized Earth science by 
retrieving sedimentary records that date back 
millions of years, offering clues to how con- 
tinents pull apart and finding microbial life 
deep beneath the sea floor. 

“We live on this Earth and we ought to 
know something about what happens beneath 
us,” says Walter Munk, an oceanographer at
the Scripps Institution of Oceanography in 
La Jolla, California, who conceived Project 
Mohole with colleagues over cocktails one 
evening in 1957. He is gratified by the success 
of scientific ocean drilling overall, but would 
still like to see the mantle breached. 

Expeditions have come close before. 
Between 2002 and 2011, four holes at a site 
in the eastern Pacific managed to reach fine- 
grained, brittle rock that geologists believe 
to be cooled magma sitting just above the 
Moho. But the drill could not punch through 
those tenacious layers. And in 2013, drillers 
at the nearby Hess Deep found themselves 
similarly limited by tough deep-crustal rocks 
(K. M. Gillis et al. Nature 505, 204-207; 2014). 

Dick and his colleagues are targeting the 
Indian Ocean ridge rather than the eastern 
Pacific because much smaller quantities of 
lava feed the sea floor there, so there is less 
hard rock to drill through. At Atlantis Bank, 
tectonic forces have lifted the sea floor to just 
700 metres beneath the waves. 

Dick knows that it is possible to reach his 
preliminary goal of 1.5 kilometres, because he 
has done it before. In 1997, he led an expedi- 
tion to Atlantis Bank that got that deep before 
disaster struck: the pipe snapped off in high 
winds, corkscrewed down inside the hole and 
plugged it up. “We're going to make sure that 
doesn’t happen this time,” he says.

Along the way, researchers hope to explore 
not just geology, but biology, too. Geologi- 
cal mapping suggests that seawater may have 
percolated several kilometres deep at Atlantis 
Bank, triggering chemical reactions that turn 
the rock into a type known as serpentinite. 
These reactions generate methane, a gas that 
sub-sea-floor microbes often munch for 
energy. JOIDES Resolution scientists will be 
checking the rock cores for microorganisms, 
says Virginia Edgcomb, a microbiologist at 
Woods Hole who will be on the cruise. 

SloMo’s first phase runs until 30 January. 
If the drilling goes well, Dick hopes to return 
with the JOIDES Resolution to reach 3 kilo- 
metres. And after that, he and his colleagues 
hope to use the Japanese drill ship Chikyu in 
the project’s third phase to drill all the way 
to the Moho. Launched a decade ago, Chikyu 
was meant to drill to the Moho in the western
Pacific, but technical challenges and a
lack of funding mean that has not happened
yet. With a capacity to drill as deep as 6 kilo- 
metres, Chikyu could finally allow geologists 
to realize their almost 60-year-old dream.
DEEP UNDERSTANDING


The SloMo Project in the Indian Ocean aims 
to drill three times deeper than an attempt in 
1997 managed, to penetrate Earth’s mantle 
and possibly reach a geophysical transition 


called the Moho. 
Particle collisions at the Large Hadron Collider produce huge amounts of data, which algorithms are well placed to process.


PARTICLE PHYSICS 


Artificial intelligence called 
in to tackle LHC data deluge 


Algorithms could aid discovery at Large Hadron Collider, but raise transparency concerns. 


BY DAVIDE CASTELVECCHI, GENEVA, 
SWITZERLAND 


The next generation of particle-collider
experiments will feature some of
the world’s most advanced thinking
machines, if links now being forged between 
particle physicists and artificial intelligence 
(AI) researchers take off. Such machines could 
make discoveries with little human input — a 
prospect that makes some physicists queasy. 

Driven by an eagerness to make discoveries 
and the knowledge that they will be hit with 
unmanageable volumes of data in ten years’ 
time, physicists who work on the Large Had- 
ron Collider (LHC), near Geneva, Switzerland, 
are enlisting the help of AI experts. 

On 9-13 November, leading lights from 
both communities attended a workshop — 
the first of its kind — at which they discussed 
how advanced AI techniques could speed 
discoveries at the LHC. Particle physicists 
have “realized that they cannot do it alone”,
says Cécile Germain, a computer scientist 
at the University of Paris South in Orsay, 
who spoke at the workshop at CERN, the 


particle-physics lab that hosts the LHC. 

Computer scientists are responding in 
droves. Last year, Germain helped to organ- 
ize a competition to write programs that could 
‘discover’ traces of the Higgs boson in a set of 
simulated data; it attracted submissions from 
more than 1,700 teams. 

Particle physics is already no stranger to
AI. In particular, when ATLAS and CMS, the
LHC’s two largest experiments, discovered the 
Higgs boson in 2012, they did so in part using 
machine learning — a form of AI that ‘trains’ 
algorithms to recognize patterns in data. 
The algorithms were primed using simula- 
tions of the debris from particle collisions, and 
learned to spot the patterns produced by the 
decay of rare Higgs particles among millions 
of more mundane events. They were then set 
to work on the real thing. 

But in the near future, the experiments will 
need to get smarter at collecting their data, not 
just processing it. CMS and ATLAS each cur- 
rently produces hundreds of millions of col- 
lisions per second, and uses quick and dirty 
criteria to ignore all but 1 in 1,000 events. 
Upgrades scheduled for 2025 mean that 
the number of collisions will grow 20-fold,
and that the detectors will have to use more 
sophisticated methods to choose what they 
keep, says CMS physicist Maria Spiropulu 
of the California Institute of Technology in 
Pasadena, who helped to organize the CERN 
workshop. “We're going into the unknown,” 
she says. 
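
To give a sense of the scale of that selection problem, the short sketch below works through the arithmetic. The text gives only "hundreds of millions" of collisions per second, so the 400 million figure is an assumed placeholder, and the function names are illustrative rather than anything used by CMS or ATLAS.

```python
# Rough arithmetic for the event-selection problem described above.
# ASSUMPTION: 400 million collisions per second stands in for the article's
# "hundreds of millions"; the actual rate is not stated.
ASSUMED_COLLISIONS_PER_SECOND = 400_000_000
KEEP_ONE_IN = 1_000      # from the article: all but 1 in 1,000 events ignored
UPGRADE_FACTOR = 20      # from the article: 20-fold growth after the upgrades


def kept_per_second(collisions_per_second: int, keep_one_in: int) -> int:
    """Events retained each second when 1 in `keep_one_in` is kept."""
    return collisions_per_second // keep_one_in


def keep_one_in_needed(collisions_per_second: int, target_kept: int) -> int:
    """Selectivity needed to hold the retained rate at `target_kept`."""
    return collisions_per_second // target_kept


# Retained rate today under the assumed collision rate.
current_kept = kept_per_second(ASSUMED_COLLISIONS_PER_SECOND, KEEP_ONE_IN)

# Collision rate after a 20-fold increase, and what the same filter would keep.
upgraded_rate = ASSUMED_COLLISIONS_PER_SECOND * UPGRADE_FACTOR
upgraded_kept_same_filter = kept_per_second(upgraded_rate, KEEP_ONE_IN)

# How selective the filter must become to keep the retained rate unchanged.
stricter_filter = keep_one_in_needed(upgraded_rate, current_kept)

print(current_kept, upgraded_kept_same_filter, stricter_filter)
```

Under these assumed numbers, holding the stored rate steady after the upgrade would mean keeping roughly 1 event in 20,000 rather than 1 in 1,000, which illustrates why the quick and dirty criteria may no longer be enough.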

Inspiration could come from another LHC 
experiment, LHCb, which is dedicated to 
studying subtle asymmetries between particles 
and their antimatter counterparts. In prepara- 
tion for the second, higher-energy run of the 
LHC, which began in April, the LHCb team 
programmed its detector to use machine learn- 
ing to decide which data to keep. 

LHCb is sensitive to tiny variations in tem- 
perature and pressure, so which data are inter- 
esting at any one time changes throughout the 
experiment — something that machine learning
can adapt to in real time. “No one has done
this before,” says Vladimir Gligorov, an LHCb
physicist at CERN who led the AI project. 

Particle-physics experiments usually take 
months to recalibrate after an upgrade, says 
Gligorov. But within two weeks of the energy 


upgrade, the detector had ‘rediscovered’ a 
particle called the J/ψ meson — first found
in 1974 by two separate US experiments, 
and later deemed worthy of a Nobel prize. 

In the coming years, CMS and ATLAS 
are likely to follow in LHCb’s footsteps, say 
Spiropulu and others, and will make the 
detector algorithms do more work in real 
time. “That will revolutionize how we do 
data analysis,” says Spiropulu.

An increased reliance on AI decision- 
making will present new challenges. Unlike 
LHCb, which focuses mostly on finding 
known particles so they can be studied in 
detail, ATLAS and CMS are designed to dis- 
cover new particles. The idea of throwing 
away data that could in principle contain 
huge discoveries, using criteria arrived at by 
algorithms in a non-transparent way, causes 
anxiety for many physicists, says Germain. 
Researchers will want to understand how 
the algorithms work and to ensure they are 
based on physics principles, she says. “It’s a 
nightmare for them.” 

Proponents of the approach will also 
have to convince their colleagues to aban- 
don tried-and-tested techniques, Gligorov 
says. “These are huge collaborations, so 
to get a new method approved, it takes
the age of the Universe.” LHCb has about
1,000 members; ATLAS and CMS have 
some 3,000 each. 

Despite these challenges, the most 
hotly discussed issue at the workshop was 
whether and how particle physics should 
make use of even more sophisticated AI, in 
the form of a technique called deep learn- 
ing. Basic machine-learning algorithms are 
trained with sample data such as images, 
and ‘told’ what each picture shows — a 
house versus a cat, say. But in deep learning, 
used by software such as Google Translate 
and Apple’s voice-recognition system Siri, 
the computer typically receives no such 
supervision, and finds ways to categorize 
objects on its own. 

Although they emphasized that they 
would not be comfortable handing over 
this level of control to an algorithm, sev- 
eral speakers at the CERN workshop dis- 
cussed how deep learning could be applied 
to physics. Pierre Baldi, an AI researcher 
at the University of California, Irvine who 
has applied machine learning to various 
branches of science, described how he and 
his collaborators have done research suggesting
that a deep-learning technique
known as dark knowledge might aid — fittingly —
in the search for dark matter.

Deep learning could even lead to the 
discovery of particles that no theorist has 
yet predicted, says CMS member Maurizio 
Pierini, a CERN staff physicist who co- 
hosted the workshop. “It could be an insur- 
ance policy, just in case the theorist who 
made the right prediction isn't born yet.”
Brain study seeks roots


of suicide 


A clinical trial will look at the neurological structure and 
function of people who have attempted suicide. 


BY SARA REARDON 


Suicide is a puzzle. Less than 10% of people


with depression attempt suicide, and 

about 10% of those who kill themselves 
have never been diagnosed with any mental- 
health condition. 

Now, a study is trying to determine what 
happens in the brain when a person attempts 
suicide, and what sets such people apart. The 
results could help researchers to understand 
whether suicide is driven by certain brain 
biologies — and is not just a symptom of a 
recognized mental disorder. 

The project, which launched in November, 
will recruit 50 people who have attempted 
suicide in the 2 weeks before enrolling. Carlos 
Zarate, a psychiatrist at the US National Insti- 
tute of Mental Health in Bethesda, Maryland, 
and his colleagues will compare these people's 
brain structure and function with those of 
40 people who attempted suicide more than 
a year ago, 40 people with depression or anxi- 
ety who have never attempted suicide and a 
control group of 40 healthy people. In doing 
so, the researchers hope to elucidate the brain 
mechanisms associated with the impulse 
to kill oneself. 

Zarate’s team will also give ketamine, a psychoactive ‘party drug’, to the group that has recently attempted suicide. Ketamine, which is sometimes used to treat depression, can quickly arrest suicidal thoughts and behaviour — even in cases in which it does not affect other symptoms of depression. The effect is known to last for about a week.

To some researchers, such findings 
suggest that ketamine affects brain circuits 
that are specific to suicidal thinking. But 
John Mann, a psychiatrist at Columbia Uni- 
versity in New York City, says that abnor- 
mal brain chemistry and genetics could also 
predispose a person to attempt suicide in 
times of great stress, such as after a job loss. 
“They're part of the person, they’re a trait,” 
Mann says. “They just get more important 
when the person gets ill.”

There is evidence that genetics influences 
a person's suicide risk. For instance, bio- 
logical relatives of adopted children who 
kill themselves are several times more likely 
to take their own lives than the general
population.


Fabrice Jollant, a psychiatrist at McGill 
University in Montreal, Canada, suggests that 
this genetic influence is related to impulsivity 
and flawed judgement, rather than to a specific 
mental illness. He has found that close relatives 
of people who killed themselves were more 
impulsive than a control group when playing 
a gambling game designed to test decision-making.
“It seems that this is something
transmitted,” Jollant says.

Other researchers are seeking biomark- 
ers that would allow clinicians to spot the 
people most at risk of suicide. Alexander 
Niculescu, a psychiatrist at Indiana University in Indianapolis, and his colleagues have identified a set of six genes whose expression is altered in the blood of people who have killed themselves.
The team has found 
that combining these biomarkers with data 
from an app that tracks mood and risk fac- 
tors can predict, with more than 90% accu- 
racy, whether people with bipolar disorder or 
schizophrenia will eventually be hospitalized 
for a suicide attempt. 

Researchers hope that a better under- 
standing of the biology that underlies sui- 
cide will lead to more effective treatments for 
suicidal impulses. But studies such as Zarate’s 
present difficult logistical and ethical chal- 
lenges. Researchers must consider whether 
a person who has just attempted suicide can 
make informed decisions about whether to 
participate in research. 

Those who study suicidal people say that 
they treat them with special care — and that 
the overall benefits of such studies outweigh 
any risks. “In most clinical trials, people at high 
risk of suicide are excluded, so we don't know 
how to treat them,” Jollant says. “We need to 
assess this population, not just say ‘exclude
them from trials’.”


UK science
budget goes up 


Celebrations as spending 
set to rise with inflation. 


BY ELIZABETH GIBNEY 


UK scientists’ worst funding fears have not come to pass. The country’s science budget will rise slightly in the coming years, Chancellor of the Exchequer George Osborne said in a much-anticipated government spending review.

Ahead of the review, scientists had braced 
for the possibility that spending would 
remain flat — as it has for the past five 
years — and continue to be whittled away 
by inflation, or even be cut. But speaking in 
the House of Commons on 25 November, 
Osborne announced that the £4.7-billion 
(US$7.1-billion) science budget will now 
rise with inflation. This would amount to an 
extra £500 million for science annually by the 
end of the decade, according to the Treasury. 
Osborne also committed to increasing the 
£1.1-billion annual budget for science infra- 
structure to £1.2 billion a year by 2020-21. 

Scientists’ initial reaction was relief. “If 
the science budget is really protected in 
real terms, then that is good news,” says 
Lee Cronin, a chemist at the University of 
Glasgow. Naomi Weir, acting director of 
the Campaign for Science and Engineer- 
ing in London, said in a statement: “This 
announcement is great news for the UK.”

However, Cronin and others noted that 
there is work to be done to reverse the dam- 
age caused by the flat budget. Although the 
increase in infrastructure spending will 
be helpful, Cronin adds that it needs to be 
“used to help replace essential equipment 
and provide the upgrades needed urgently, 
rather than just fund shiny new projects”. 

While acknowledging that the outcome 
could have been much worse, Jenny Rohn, 
who chairs the UK lobby group Science is 
Vital, highlighted that the science budget is 
smaller in real terms in 2015 than it was in 
2010, owing to erosion by inflation. 

The science budget will also have to cover 
a new Global Challenges research fund, 
aimed at addressing the problems faced by 
developing countries. 

Osborne announced that the govern- 
ment would implement the recommenda- 
tions of a review by geneticist Paul Nurse to 
create Research UK, a new umbrella body 
to oversee the seven research councils that 
distribute most of the science budget.
France’s President François Hollande attends a national tribute in Paris on 27 November 2015 to the victims of the 13 November terrorist attacks.


TERRORISM 


Why Europeans 
turn to jihad 


Terrorism is tough to study, but researchers have gleaned 
insights from the current generation of Islamist extremists. 


BY DECLAN BUTLER 


In the wake of the terrorist attacks in Paris
on 13 November that left 130 dead and
more than 350 wounded, Alain Fuchs,
president of the French National Centre for
Scientific Research (CNRS), announced a
fresh call for proposals for research on terrorism.
Acknowledging that any effort with no
immediate effect may seem “derisory”, Fuchs
said that science can help to open up avenues
of analysis.

The Islamist terror group ISIS also carried
out deadly attacks this year in Tunisia, Lebanon,
Bangladesh and other countries, and
downed a Russian airliner in the Sinai Peninsula.
But as thousands of Europeans have left
to join Islamist groups in conflict zones, and
are at risk of returning home trained to carry
out further attacks, the continent is on edge.

Terrorism researchers are trying to understand
how young people in Europe become
radicalized, by looking for clues in the life
histories of those who have committed or
planned terrorist acts in recent years, left
the continent to join ISIS, or are suspected
of wanting to become jihadists. A mixture of
sociologists, political scientists, anthropologists
and psychologists, such researchers are drawing
on information generated by police, judicial
inquiries and the media, and, in some cases, on 
interviews. They also study factors at play in 
prisons and socially deprived areas. Some of
their insights are summarized here. 


Religion is not the trigger. The rise of jihad 
in Europe has led to an assumption that there 
is a radicalization of Muslims more generally 
across the continent. Yet research suggests 
that most extremists are either people who 
returned suddenly to Islam or converts with 
no Islamic background, says Olivier Roy, who 
specializes in political Islam and the Middle 
East at Italy’s European University Institute 
near Florence — and as many as one in four 
French jihadists is a convert. Roy summarized 
the latest research at a conference organized 
in Mainz on 18-19 November by the German 
Federal Criminal Police Office. 

Violent extremism emerges first, with a 
religious justification tagged on after, adds Rik 
Coolsaet, head of political science at Ghent 
University in Belgium, who studies jihadis 
and foreign policy. He notes that two young 
British men who were jailed last year on ter- 
rorism offences after fighting in Syria had 
earlier ordered online the books Islam for
Dummies and The Koran for Dummies. 


Resentment is the common ground. It is 
difficult to make generalizations about how 
people become radicalized in Europe. At the 
Mainz conference, Roy said that many extrem- 
ists come from broken families or deprived 
areas, lack education and are unemployed. A 
smaller number are well educated, have held 
jobs and have middle-class lifestyles. Some are 
in stable relationships and have young children. 
The characteristics that extremists seem to share 
are resentment directed at society and a narcis- 
sistic need for recognition that leaves them open 
to a narrative of violent glory, said Roy.

Social factors can contribute to such frus- 
trations, according to Farhad Khosrokhavar, a 
CNRS researcher who works at the School for 
Advanced Studies in Social Sciences in Paris. 
Almost all European extremists and terrorists 
are second- and third-generation immigrants, 
who Khosrokhavar says are often “stigmatized,
rejected and treated as second-class citizens”.
However, since about 2013, the profile
of those leaving to fight in Syria has included a 
much larger proportion of middle-class youth 
than in previous generations, he says. 


Terrorism breeds in prisons. The link between
terrorism and prison was highlighted this
year. The three terrorists involved in the January
attack in Paris on the satirical publication
Charlie Hebdo and a kosher supermarket, as
well as some of the 13 November attackers, had
all done time.

Many French terrorists have a history of petty
crime that landed them in prison. Stays there
often proved seminal experiences on their
path to radicalization, says Khosrokhavar,
who spent several years interviewing
some 160 staff and inmates at 4 large
French prisons, including 15 inmates
sentenced for terrorism offences.
He says that prisoners often come
under the influence of — and form lasting bonds 
with — radical Islamists and terrorist networks. 


‘Entrepreneurs’ drive terrorism. Most of those 
who get involved in jihadi terrorism in Europe 
are “misfits and drifters” — people who joined 
militant networks during life crises or through 
friends and relatives on the inside, says Petter 
Nesser, a terrorism researcher at the Norwegian 
Defence Research Establishment in Kjeller. 
But he says that the key actors in terrorist 
activity are a much smaller number of 
“entrepreneurs”. These seasoned, ideologically
driven activists are part of transnational terrorist 
webs linked both to extremist groups through- 
out Europe and to armed groups in conflict 
zones. They are the ones who bring structure 
and organization to the disaffected majority, 
through recruitment and indoctrination. 


Molenbeek isn’t the terrorist capital of Europe.
Several of the terrorists involved in the latest 
Paris attacks, and the perpetrators of previous 
attacks in Europe, had lived in the Molenbeek 
district of Brussels, which has a large Muslim 
community, mostly of Moroccan descent. This 
has led some politicians and media outlets to 
label it Europe's terrorism capital — and to 
blame factors such as social deprivation or an 
apparent lack of integration of Muslims. 

“This is misleading,” says Nesser. Jihadi hot 
spots have emerged across Europe in environ- 
ments ranging from poor suburbs, to universi- 
ties and schools, to prisons. The key ingredient 
in the spread of jihadism in any location is a 
critical mass of jihadist entrepreneurs, he says. 

A focus on Molenbeek obscures the fact that 
European jihadism is transnational, Nesser 
says, and that its main drivers are armed con- 
flicts and militant groups involved in those 
conflicts. He adds: “It is also unfair and stigma- 
tizing towards the inhabitants of this Belgian 
suburb.” SEE EDITORIAL P.7
Build a better PhD


There are too many PhD students for too few academic 
jobs — but with imagination, the problem could be solved. 


BY JULIE GOULD 


“Since 1977, we've been recommending that graduate departments
partake in birth control, but no one has been listening,” said
Paula Stephan to more than 200 postdocs and PhD students at
a symposium in Boston, Massachusetts, in October this year.
Stephan is a renowned labour economist at Georgia State University
in Atlanta who has spent much of her career trying to understand
the relationships between economics and science, particularly
biomedical science. And the symposium, ‘Future of Research’, discussed
the issue to which Stephan finds so many people deaf: the academic
research system is generating progeny at a startling rate. In biomedicine,
said Stephan, “we are definitely producing many more PhDs
than there is demand for them in research positions.”

The numbers show newly minted PhD students flooding out of
the academic pipeline. In 2003, 21,343 science graduate students in
the United States received a doctorate. By 2013, this had increased
by almost 41% — and the life sciences showed the greatest growth. 
That trend is mirrored elsewhere. According to a 2014 report look- 
ing at the 34 countries that make up the Organisation for Economic 
Co-operation and Development, the proportion of people who leave 
tertiary education with a doctorate has doubled from 0.8% to 1.6% 
over the past 17 years. 

Not all of these students want to pursue academic careers — but 
many do, and they find it tough because there has been no equivalent 
growth in secure academic positions. The growing gap between the 
numbers of PhD graduates and available jobs has attracted particu- 
lar attention in the United States, where students increasingly end up 
stuck in lengthy, insecure postdoctoral research positions (see Nature 
520, 144-147; 2015). Although the unemployment rate for people with 
science doctorates is relatively low, in 2013 some 
42% of US life-sciences PhD students graduated 
without a job commitment of any kind, up from 
28% a decade earlier. “But still students continue to 
enrol in PhD programmes,” Stephan wrote in her
2012 book How Economics Shapes Science. “Why? 
Why, given such bleak job prospects, do people 
continue to come to graduate school?” 

One reason is that there is little institutional 
incentive to turn them away. Faculty members rely 
on cheap PhD students and postdocs because they 
are trying to get the most science out of stretched 
grants. Universities, in turn, know that PhD stu- 
dents help faculty members to produce the world- 
class research on which their reputations rest. 
“The biomedical research system is structured 
around a large workforce of graduate students 
and postdocs,” says Michael Teitelbaum, a labour 
economist at Harvard Law School in Cambridge, 
Massachusetts. “Many find it awkward to talk 
about change.” 

But there are signs that the issue is becoming less taboo. In 
September, a group of high-profile US scientists (Harold Varmus, Marc 
Kirschner, Shirley Tilghman and Bruce Alberts, colloquially known 
as ‘the Quartet’) launched Rescuing Biomedical Research, a website 
where scientists can make recommendations on how to ‘fix’ differ- 
ent aspects of the broken biomedical research system in the United 
States — the PhD among them. “How can we improve graduate educa- 
tion so as to produce a more effective scientific workforce, while also 
reducing the ever-expanding PhD workforce in search of biomedical 
research careers?” the site asks. 

Nature put a similar question to 33 PhD students, scientists, postdocs
and labour economists and uncovered a range of opinions on how to 
build a better PhD system, from small adjustments to major overhauls. 
All agreed on one thing: change is urgent. “Academia really is going to
have to be dragged kicking and screaming into the twenty-first century,”
says Gary McDowell, a postdoctoral fellow at Tufts University in
Medford, Massachusetts, and a leader of the group behind the Future 
of Research symposium. The renovation needs to happen now, says 
Jon Lorsch, director of the US National Institute of General Medi- 
cal Sciences in Bethesda, Maryland. “We need to transform graduate 
education within five years. It’s imperative. There’s a lot at stake for 
scientists, and hence for science.” 


TRACK THE PHD 


One place to begin is with hard facts: show prospective students 
and supervisors data on trainees’ chances of moving into academic 
research or other careers. Prospective students “aren't thinking stra- 
tegically about what they really want to do or what they’re best suited 
for”, says Patricia Labosky, a programme director for scientific training
at the US National Institutes of Health (NIH) in Bethesda, Maryland. 

A 2015 Nature survey of more than 3,400 science graduate students
around the world suggested that many were overly optimistic about
their chances in academia. About 78% of respondents said that they 
were “likely” or “very likely” to follow an academic career, and 51% 
thought that they would land some type of permanent job in one to 
three years. In reality, only about 26% of PhD students in the United 
States move into tenured or tenure-track positions, and getting there 
can take much longer than this (see ‘Ups and downs of PhDs’). 

But although some data exist about career paths, there are key gaps 
relating to the range of job opportunities, earnings, time spent as a 
postdoc and long-term career trajectories, says Julia Lane, an econo- 
mist at New York University. A January report on post-PhD careers 
by the US Council of Graduate Schools in Washington DC found 
that there are no standardized ways to collect information on gradu- 
ates after they have left their educational institution; only around 
one-third of universities in the United States and 
Canada formally compile such data. 

In October, Stanford University in California 
published the results of a major effort to track graduates
either 5 or 10 years after their PhD. It showed
that the proportion of bioscience PhD students progressing
to postdoctoral positions had dropped
from 41% to 31% in the more recent graduate
group, and that many were moving into business, 
government or non-profit positions. This probably 
reflects the growing bottleneck in academic jobs 
and booming opportunities in business. 

Lane is leading a more comprehensive effort 
to track career outcomes in research called 
UMETRICS, which is based at the University of 
Michigan in Ann Arbor. By combining anonymized 
human-resource and administrative data from uni- 
versities with US Census Bureau data on earnings, 
places of work and job titles, UMETRICS will be 
able to produce campus-level reports on the career 
outcomes of graduate students. A student interested 
in a chemistry PhD, for example, could scan a campus report and see 
what previous graduates went on to do, where they went and how much 
they earn. It will take several years before the first data sets are released, 
Lane says — but when they are, “the students opting in to graduate 
schools will go in with eyes wide open”.


REVAMP THE PHD 


Many PhD students enjoy the intellectual freedom of a PhD for a 
few years and then successfully move on to other things. But a lot of 
students want more preparation and training for that step — such as 
building skills in management, budgeting or negotiation. “Apparently, 
you have to learn these things somewhere on the side, since you are 
supposed to spend all your time as a PhD and postdoc doing research,” 
says Joanna Klementowicz, a postdoc at the University of California, 
San Francisco (UCSF). 

The current graduate education system in many countries is 
based on an apprenticeship model, wherein lab heads train younger 
researchers in the craft of research. This system has been prominent 
since the 1800s, when the first ‘modern’ PhD was awarded by the
University of Berlin. Although the scientific enterprise has changed 
dramatically since then, the PhD system has not. 

Modernizing the PhD could improve training in areas of research 
ranging from reproducibility to experimental design and entre- 
preneurship. It could also help to solve the bottleneck problem by 
equipping doctorate holders with soft skills that make them more 
employable wherever they go. “We need to tailor graduate education 
to meet the needs of students without violating what it means to be a 
scientist,” says Alan Leshner, chief executive emeritus of the American
Association for the Advancement of Science in Washington DC.

Some funding bodies and research institutions have already taken 
this on board. In 2013, the NIH started the Broadening Experiences 
in Scientific Training (BEST) initiative — a US$3.7-million
programme that is designed to improve training for biomedical PhDs
and postdocs. “We got a lot of feedback from [employers] that 
the graduates weren't ready for careers outside of academia,” says 
Labosky, who heads the programme. 

At UCSF, PhD students on the BEST programme spend nine 
months training in areas such as management, interviewing and 
networking, and are put into groups that work together to explore 
career objectives. “The programme made me practical: I learned to 
look out for what I can apply for, what my skills were matched to and 
what people with a PhD like mine go on to do,” says Klementowicz,
who took the programme as a postdoc. 

Some scientists would like to see particular emphasis put on 
teamwork to reflect the increasingly collaborative nature of 
research. David Golan, dean of graduate education at Harvard 
Medical School in Boston, Massachusetts, is considering how to 
ingrain teamwork more deeply into the graduate-school experi- 
ence. “We have toyed with the idea of having students form a team 
before they apply to grad school,” he says. They might then be given 
a project to work on together throughout their training — and 
perhaps even be examined together. 


There may be too many PhD graduates for academia, but there is 
plenty of demand for highly educated, scientifically minded workers 
elsewhere. So some scientists propose that the PhD should be split 
into two: one for future academics and a second to train those who 
would like in-depth science education for use in other careers. 

Biologist Anthony Hyman, director of the Max Planck Institute of 
Molecular Cell Biology and Genetics in Dresden, Germany, is one of 
those who thinks that a split PhD might work. Students in the aca- 
demic-track PhD would focus on blue-skies research and discovery, 
he says. A vocational PhD would be more structured and directed 
towards specific careers in areas such as radiography, machine learn- 
ing or mouse-model development. 

A similar concept already exists in engineering: students in the 
United Kingdom, the United States, France and Germany can 
choose to study for either an academic-style PhD in engineering or 
a doctorate in engineering (EngD), which is designed with indus- 
trial careers in mind and often involves a supervisor in industry 
alongside one in academia. David Stanley, who manages an EngD 
programme that focuses on nuclear engineering at the University 
of Manchester, UK, says that the programme is aimed at supply- 
ing industry with employees. “Graduates with an EngD are highly 
valued in industry, more than those with PhDs, because of their 
extended training,” he says. 

Elsewhere, industrial PhDs are taking shape in the biomedical 
sciences. One of the oldest government-organized industrial PhD 
schemes is run by Innovation Fund Denmark, which supports students 
who are simultaneously enrolled at a Danish university and employed 
(and paid) by a private-sector company. Melanie Sinche, director of 
education at the Jackson Laboratory for Genomic Medicine in Farm- 
ington, Connecticut, is enthusiastic about the idea of a vocational PhD 
at her institute, where it might fulfil a need for more expert computa- 
tional biologists. “The number of people qualified to do this is small, 
and there are lots of employers competing for this small pool of can- 
didates,” she says. 

But the split PhD could face challenges if the two tracks are valued in 
different ways: academics could view a vocational PhD as second-class, 
whereas tech companies could view an academic PhD as too abstruse
for the real world. That could end up limiting the
career options of doctorates rather than broadening
them, says Hyman. Stanley counters that
EngD students do not have that problem. “A couple
of students a year find their way back into
academia to conduct research,” he says.
Some scientists call for more drastic measures — cutting down the
number of people who pursue a PhD. 

Siphoning off more students into master’s programmes is one way 
to reduce PhD numbers, says Bruce Alberts, professor of biochemistry 
and biophysics in the department of medicine at UCSF. A master’s can 
offer advanced scientific training that is sufficient for many careers, as 
well as a taste of research, in one or two years rather than the four or 
five eaten up by a typical PhD. “In an ideal world, everyone would go 
in for a master’s,” Alberts says. 

Master's degrees are already common across Europe. In the Nether- 
lands, students are required to complete a master’s before embarking 
on a PhD. “There are many who don't want to be in academia who 
leave with a master’s to work in government institutions, companies, in 
publishing,” says Frank Miedema, professor and head of immunology 
at the University Medical Center Utrecht in the Netherlands. “And a
master’s is not considered a failure for those who
can’t make it to a PhD.”

Victoria Evans graduated with a master’s degree 
in astrophysics from Cardiff University, UK, in 
2012. “The research project in the master’s gave me 
an insight into what a PhD project would be like,” 
she says, “and I came to the conclusion that it wasn’t 
what I wanted to do.” She now works as a nuclear-safety
engineer for EDF Energy on the west coast
of Scotland. “The problem-solving and analytical 
skills that I learned during my master’s were more 
than sufficient for me to work in this field.”

In the United States, the science master’s has 
often had a lower status than the PhD — but uni- 
versities are now launching more of them. Between 
2000 and 2011, the number of science and engi- 
neering master’s degrees available increased by 57%, compared with a 
38% increase in doctoral degrees, according to the US National Science 
Foundation. Part of that growth has been in the professional science 
master’s degree, a programme developed in the late 1990s as a graduate 
degree that would simultaneously develop scientific and workplace 
skills. Last year, Harvard Medical School introduced a two-year mas- 
ter’s in immunology aimed at students who want additional classroom 
and research experience to help them decide whether to continue on 
to a PhD or MD, or to transition to industry.

But master’s programmes are no panacea. Unlike most doctoral 
students, master’s students in the United States and Europe are 
often required to pay for their tuition, and that could dissuade many 
from signing up. “This does create a social access problem,” said 
neuroscientist Eve Marder of Brandeis University in Waltham, Mas- 
sachusetts, at last month’s Future of Research meeting. 


Labour economists have been advocating for a reduction in the 
number of graduate students who enter biomedical sciences for several 
decades. Yet there is enormous resistance to change. That’s what the 
Quartet found, when it proposed gradually reducing the numbers of 
PhD students as part of its efforts to rescue biomedical research. “This 
idea has had the most opposition from our colleagues,” says Alberts. 
Faculty members and research institutions may be especially reluctant 
to give up the cheap workers who power their research when govern- 
ment funding for biomedicine has fallen, as it has in the United States 
for the past decade or so. And some scientists argue that fewer PhD 
graduates would be a loss to science and society as a whole. “The draconian
measures of restricting access to graduate school is detrimental
to science,” said Marder at the Future of Research meeting. “It means 
we would restrict the imagination in our workforce.” 

Cuts to PhD programmes haven't gone down well. When the 
Canadian Institutes of Health Research cancelled its 30-year-old 
MD/PhD programme earlier this year owing to budget tightening,
academics and students reacted with horror. But other fields regulate
the flow of students into courses to match supply to demand. The
American Bar Association, which oversees the legal system in the 
United States, attempts to regulate the number of qualified lawyers by 
exerting strict control over the number of law schools. And bar associa- 
tions set fiendishly difficult examinations for would-be lawyers to get 
into law school in the first place. 

Stiffer entrance assessments for those who want to pursue a PhD 
could cut down entrant numbers — if the right criteria can be found. In
the United States, Graduate Record Examinations (GREs) are used as
a way of selecting entrants for graduate school, but the system is hardly 
perfect: one survey showed that 37% of US biology PhD students drop 
out before completing their degree. When Orion Weiner, a molecular 
biologist at UCSF, did a small, retrospective study of graduate students
admitted onto one of his university's biology PhD programmes, he 
found that previous experience in research and the subject-specific 
GRE results (but not the analytical, verbal or quan- 
titative elements) were good indicators of future 
success in graduate school. 

A broader entrance assessment could look at 
students’ experience in communication, manage- 
ment, teamwork and career goals. That could be 
used to filter students with a passion for academic 
or industrial research towards PhD programmes 
and send others into a master’s or other types of 
training, says Bill Lindstaedt, executive director 
for career advancement at UCSF. 

Stephan believes that funding bodies should 
have a major role in limiting the number of bio- 
medical PhD places to better match supply and 
demand, and she also proposes that students 
should contribute to their training costs. “When 
we have to pay something out of pocket, we think a little more clearly 
about whether that is a good fit for us,” she says. Such ideas may be 
controversial — but many people say that they have to be considered. 

At the heart of the problem, say scientists, is that the community is 
not discussing the PhD problem enough. “There is a reluctance from 
supervisors to tell undergrads and grad students the reality of the sys- 
tem,” says postdoc McDowell. “The misinformation exists because 
the system is worried about deflecting smart people from entering.”
Although principal investigators acknowledge the difficulty of secur- 
ing an academic position, the system worked for them and so it is 
tempting to tell students that they can do it too — just another experi- 
ment, another publication or another year, and you'll get there. 

Grass-roots groups such as Future of Research are calling attention 
to the issue, as are efforts such as Rescuing Biomedical Research. 
Meanwhile, some experts say that the onus falls partly on prospec- 
tive and current PhD students to make sure their eyes are open. They 
should arm themselves with as much information as possible, says 
Labosky, so that “they are aware of their alternative options and can 
make plans”. 

Stephan does see some prospect that her call for PhD birth con- 
trol will be heard. She says that change might happen naturally, as 
more information becomes available on career outcomes, and that 
flat funding streams could prevent further growth in biomedical 
PhDs. “Individuals might become less focused on PhD production, 
and universities and faculty are more likely to pay attention to these 
recommendations.” 

Teitelbaum, for his part, does not favour a large cut in biomedical 
PhDs, and instead prefers a more considered approach. “Find out why 
people start PhDs and what they think their career prospects are from 
the very beginning,” he says. “Like ballet dancers or actors, if they chose
to take it on knowing their chances of becoming a successful professor, 
then let them carry on.” SEE EDITORIAL P.7 AND CAREERS P.155


Julie Gould is an editor for Naturejobs. 
THE BODY ELECTRIC


RESEARCHERS WANT TO WIRE THE HUMAN BODY WITH SENSORS THAT COULD 
HARVEST REAMS OF DATA — AND TRANSFORM HEALTH CARE. 


BY ELIZABETH GIBNEY 


Göran Gustafsson looks at people and thinks of cars — the
ageing models that rolled off assembly lines a few decades
ago. Today, says Gustafsson, cars are packed with cutting-edge
sensors, computers and sophisticated communications
systems that warn of problems when they are still easy to fix, which
is why modern vehicles rarely surprise their drivers with catastrophic 
breakdowns. 

“Why don’t we have a similar vision for our bodies?” wonders 
Gustafsson, an engineer whose team at the Swedish electronics company 
Acreo, based in Kista, is one of many around the world trying to make 
such a vision possible. Instead of letting health problems go undetected 
until a person ends up in hospital — the medical equivalent of a roadside 
breakdown — these teams foresee a future in which humans are wired 
up like cars, with sensors that form a similar early-warning system. 

26 | NATURE | VOL 528 | 3 DECEMBER 2015

Working with researchers at Linköping University in Sweden, Gustafsson's
team has developed skin-surface and implanted sensors, as well
as an in-body intranet that can link devices while keeping them private.
Other groups are developing technologies ranging from skin patches 
that sense arterial stiffening — a signal of a looming heart attack — to 
devices that detect epileptic fits and automatically deliver drugs directly 
to affected areas of the brain. 

These next-generation devices are designed to function alongside 
tissue, rather than be isolated from it like most pacemakers and other 
electronic devices already used in the body. But making this integra- 
tion work is no easy feat, especially for materials scientists, who must 
shrink circuits radically, make flexible and stretchable electronics that 
are imperceptible to tissue, and find innovative ways to create inter- 
faces with the body. Achieving Gustafsson’s vision — in which devices 
monitor and treat the body day in, day out — will also require both new 
power sources and new ways of transmitting information. 

Still, the potential to improve health care substantially while reducing
its costs has drawn both researchers and physicians to the challenge,
says John Rogers, a materials scientist at the
University of Illinois at Urbana–Champaign. “I
haven't found any clinical folks who say ‘That's
pie in the sky, come back to me in 20 years’,”
he says. “They say, ‘Wow, that’s cool. Here are
three ways we can use it today, and how do we
get started on a collaboration?’”

Surface sensors need to be as flexible and stretchy as the skin they are mounted on.
A. CHEZIERE/BEL/EMSE

Sensors woven into the body are a natural extension of handheld 
smartphones and wearable devices, says Rogers (see Nature 525, 22-24; 
2015). “I think electronics is coming at you,” he says. “It’s migrating 
closer and closer and I think it’s a very natural thing to imagine that they 
will eventually become intimately integrated with the body.”


SKIN DEEP 

The first step beyond wearables will be wireless sensors mounted 
directly on the skin, where they can pick up a host of vital signs, includ- 
ing temperature, pulse and breathing rate. Unfortunately, says Rogers, 
“biology involves bending, stretching and swelling”, which makes conventional
electronics built from stiff silicon wafers a very poor choice
for such sensors. 

His team has developed ‘epidermal electronics’: flexible, biodegradable
stick-on patches that are crammed with sensors but almost
imperceptible to the user. Attached like temporary tattoos, the patches
use normal silicon electronics, but thinned down and transferred to
a flexible backing using a rubber stamp. The patches draw power
either from nearby magnetic fields or by harvesting radio waves, using
S-shaped wires and antennas designed to stretch, twist and bend. “They
adopt a wavy kind of geometry, so when you stretch, the wave shapes
can change, like accordion bellows,” says Rogers.

Rogers has co-founded a spin-off company — MC10, based in Lex- 
ington, Massachusetts — that next year will start marketing versions of 
the device as BioStamps: temporary patches that measure heart electri- 
cal activity, hydration, body temperature and exposure to ultraviolet 
light. The patches will be available to consumers first, says Rogers, but 
his real target is medicine. Results are expected soon from a trial at the 
neonatal intensive-care unit at Carle Foundation Hospital in Urbana, 
where doctors are using the patches to monitor the vital signs of new- 
born babies without the need for intrusive cables and scanners. MC10 is 
also collaborating with Brussels-based pharmaceutical company UCB 
on tests of a patch that monitors tremors in people with Parkinson's disease,
to track their illness and whether they are taking their medication.

Rogers’ patches are relatively small, but at the University of Tokyo, engineer
Takao Someya has created a sensor-laden electronic skin that can
be made in much larger pieces. His latest film is just 1 micrometre thick,
and so light that it floats like a feather, yet it is robust enough to cope with 
the stretching and crumpling needed to flex with an elbow or knee. It can 
provide readouts on temperature — heat in a wound can signal infec- 
tion — moisture, pulse and oxygen concentration in the blood. Someya 
achieves this by ditching silicon altogether, and instead using inherently 
soft organic components made of carbon-based polymers and other mat- 
erials. Organic circuits can be printed onto a plastic film, making them 
cheap and easy to produce in large quantities. And they are versatile: they 
work in both high-temperature and water-based environments. 

Skin also inspires Zhenan Bao, an engineer at Stanford University 
in California. Her team creates thin pressure sensors by sandwiching 
micrometre-scale rubber pyramids between films. Even a slight touch
will compress the pyramids’ tips, changing how electric current flows 
between the films. The sensors can be used in heart monitors that track 
how fast pressure waves pass through arteries. This can reveal increased 
stiffness in the vessels — a predictor of heart attacks. Last year, the US 
Food and Drug Administration approved a wireless pressure sensor that 
can be implanted inside the hearts of people with advanced heart disease; 
Bao’s device could do a similar job from the surface of the skin.

As useful as skin-mounted patches might be, much more information 
is available deeper in the body. “There’s a reason why at the hospital, 
they draw your blood,” says Michael Strano, a chemical engineer at the
Massachusetts Institute of Technology (MIT) in Cambridge. “There 
are markers in blood that are exquisitely good at predicting disease.” 

But delving deeper brings fresh challenges. Ideally, says Strano, sen- 
sors under the skin should be not only non-toxic, but also stable enough 
to function inside the body for years at a time if need be, and biocom- 
patible — meaning that they don’t trigger the body’s immune response. 
Yet most current devices fall short on one score or another. For exam- 
ple, sensors that detect chemical signals in the blood called biomarkers 
often use biological materials that degrade very quickly. This is a severe 
limitation for the advanced, real-time sensors that are currently used to 
monitor glucose in people with diabetes, says Strano: the devices detect 
glucose with an enzyme reaction that produces hydrogen peroxide. This 
degrades the sensors so quickly that they must be replaced within weeks. 

To get around that, Strano’s lab has developed synthetic, long-lived
detector materials that can be mixed with a water-based gel and injected
under the skin like a tattoo. The ‘ink’ for this tattoo consists of carbon
nanotubes coated with dangling polymer strands, which have a lock-and-key
chemical structure that recognizes biomarkers by dictating which
molecules can dock with them. When biomarkers bind to the polymer,
they subtly change the optical properties of the nanotube: shine a light 
on the tattoo, and a glow reveals the presence of the biomarker. 

Strano and his team have developed carbon-nanotube sensors to
monitor nitric oxide in blood (an inflammatory marker that can
indicate infection or even cancer) and are working on glucose and
cortisol, a stress biomarker that may prove useful for monitoring post-traumatic
stress disorder and anxiety disorders. The nitric oxide sensor
worked for 400 days in mice, which to Strano’s knowledge is the longest 
any implanted chemical sensor has been in place, and did so without 
provoking any immune response. For many other kinds of device, the 
jury is still out. “For electronic materials, especially plastic-based and 
organics, it’s still unknown what their long-term effects are,” says Bao.


Now Strano is starting work with MIT engineer Daniel Anderson on
devices that could combine sensors with drug-delivery systems. They 
hope to adapt microchips pioneered by fellow MIT engineer Robert 
Langer to respond to a range of triggers by releasing the appropriate 
drugs, encased in polymer capsules. The first human trial of a drug-delivering
‘pharmacy on a chip’ (without the sensors) was in 2012,
in eight women with osteoporosis.

It will be a long time before such devices can be used to detect dis- 
eases reliably and treat them automatically, except perhaps for diabetes, 
which has been extensively studied. Strano’s devices are good at binding 
only with their target molecules, but big questions remain about what 
fluctuations in biomarker signals actually mean in terms of health, he 
says. His team is modelling biomarkers in the body, to help to decide
where the sensor needs to be and how quickly it should react to give
useful information. “Often you need to rely on many different sensory
parameters to make a decision. It’s not enough that one chemical is overexpressed,”
says Magnus Berggren, an electronic engineer at Linköping
University who is collaborating with Gustafsson.


MOVING TARGET 

Some researchers’ targets lie still deeper in the body, and for them, 
flexibility and biocompatibility are even more important. If a rigid
sensor rubs against a moving organ such as the heart or the brain, in
which the cells shift slightly as the animal breathes, the body will quickly
surround it with a wall of scar tissue. And if sensors move relative to 
the organ, the results will be unreliable in any case.

[Figure ‘Wired for life’. Sensors mounted on the skin are easy to apply and remove, and can obtain high-quality data on breathing, heart rate, blood pressure and other vital signs. But they must be flexible and stretchy enough to follow the natural movement of the body. Sensors injected under the skin can access the trove of information carried in the blood by chemical signals called biomarkers. The devices must be long-lived and biocompatible, so that they don’t trigger an immune response. Label: subcutaneous tissue.]

Bioelectronics engineer George Malliaras at the Ecole Nationale 
Supérieure des Mines de Saint-Etienne in Gardanne, France, and his 
colleagues are among those developing flexible replacements for the 
relatively rigid sensors currently used to track distinctive electrical pat- 
terns in the brains of people with epilepsy or Parkinson's disease. Made 
of organic, conducting polymers, these flexible electronics respond to 
chemical signals — the flow of ions that generates the electrical patterns. 
This not only increases sensitivity, but also lets researchers “interface
with biology in a wholly different fashion”, he says.

The team’s latest device, tested in rats as well as in two humans undergoing
surgery for epilepsy, has detected the firing of individual neurons,
says Malliaras. And if the process is reversed, he adds, the sensor
can be used to deliver drugs. Devices known as organic electronic ion
pumps respond to an applied voltage by forcing drugs (small charged
particles) out of a reservoir. Working with the group at Linköping
University and the French National Institute of Health and Medical
Research in Marseilles, Malliaras’s team is coupling his epilepsy sensor
to an ion pump that responds to seizures by releasing epilepsy drugs into
the correct part of the brain. Berggren and the Linköping team have
used a similar technique to develop a ‘pacemaker for pain’ that delivers
analgesics directly to the spinal cord.


KEEP IT GOING 

Any electrical device is limited by its need for power. Devices that sit 
on or near the skin can incorporate antennas that harvest power wire- 
lessly — as long as an external source is nearby. But sensors deeper 
in the body often have to rely on batteries, which are bulky and need 
replacing. And some, such as Berggren’s pain-relief pump, need to 
have wires threaded through the overlying tissues — an arrange- 
ment that is both cumbersome and a potential route for infection 
(see ‘Wired for life’).

To get around such problems, Zhong Lin Wang, a nanoscientist at the 
Georgia Institute of Technology in Atlanta, has spent the past decade 
trying to harvest the tiny amounts of mechanical energy generated when 
people walk or even breathe. “We started thinking, how do we convert 
body motion into electricity?” he says. 

His latest design uses static electricity, long thought of as a nuisance,
to convert the movement of inhaling and exhaling into enough
energy to power a pacemaker. The generator uses two different polymer
surfaces, sandwiched between electrodes and connected in a circuit.
When the user breathes in and out, the surfaces touch and separate,
swapping electrons, the same thing that happens when a balloon is
stroked with a wool cloth. The build-up of charge causes current to flow
through the wire. “Inhale and exhale, move back and forth or drive up
and down and you generate power,” says Wang.

Starting in 2014, Wang began testing the system in rats, generating milliwatts
of power from a device the thickness of a few sheets of paper. Now
his team is testing the same technology in pigs. 

Rogers’ team has created a biodegradable battery using electrodes
made of magnesium and other metals that are safe in low concentrations
and that slowly dissolve in the body. “Some devices you want to last the
life of the patient. In others, you only need and want the device to be
temporary,” says Rogers.

[Figure ‘Wired for life’. Sensors woven into the body could alert people to medical problems before they become seriously ill, if the devices can overcome some daunting challenges. Devices implanted into the heart, brain or other deep tissues can gather data directly from the source and deliver drugs or stimulation exactly where needed. But they must have ways to get power in and data out, without resorting to wires. Labels: epidermis; flexible brain sensor; flexible heart pacemaker; spine-implanted ion pump; carbon-nanotube-based sensors.]


PERSONAL PRIVACY 

The technology could be revolutionary, but the vision of a wired-up 
body that sends data to an outside computer or medical centre faces 
a threat that already troubles the wearables industry: hacking. “When 
a semiconductor chip is introduced inside the body, hacking is a truly 
serious issue,” says Someya.

One solution is to analyse data on the device itself, reducing the 
amount that gets sent over the airwaves. Another is to avoid the airwaves 
altogether. In as-yet-unpublished work, the Swedish team has developed 
an in-body intranet that transmits signals at low frequency using the 
body’s water as its wires. To send information between devices, or from a
device to a smartphone, users must physically touch the objects with their
hands. This keeps the signals low-power and private, and avoids clogging
up the data-transmitting frequencies that are already squabbled over by
mobile phones and wireless routers. “It's only transmitted and exposed
within your body,” adds Berggren, who says that the system can already
exchange data between electronically labelled objects through the body
to a smartphone, and will soon integrate on-skin sensors.

However good the devices, pioneers of new materials will also strug- 
gle against a tide of medical regulation, says Malliaras. That, along with 
the concerns of chemical suppliers who are afraid that failing devices 
could leave them vulnerable to lawsuits, “puts a big brake on the adoption
of new materials”, he says.

Berggren and his collaborators at Acreo are among the first to try 
to connect a range of devices by wiring up humans. But they readily 
acknowledge that making the vision a reality will require multiple 
companies and research teams, as well as the involvement of insur- 
ance companies and health-care providers. 

Berggren knows that there are big hurdles. “The challenge is to put 
everything together,” he says. “But they did it for the car industry and 
it’s impressive. You rarely see cars standing along the side of the road 
waiting for repair. Whether it’s possible to do this also for humans is still 
a question mark, but it’s definitely worth trying.”

Malliaras agrees. “A car you usually keep for less than ten years,” he says.
“A body you want to keep for 80 or 90 years; it’s a lot more precious.”


Elizabeth Gibney is a reporter for Nature in London.

A treatment plant in Chongqing, China, which processes 40,000 cubic metres of wastewater per day.


Reuse water 
pollutants 


Extracting carbon, nitrogen and phosphorus from 
wastewater could generate resources and save energy, 
say Wen-Wei Li, Han-Qing Yu and Bruce E. Rittmann.


Treating domestic and industrial
wastewater so that it can be reused
for drinking, irrigation and manufacturing
is costly. The treatment of used
household water from cooking, washing, 
cleaning and sanitation alone accounts for 
3% of global electricity consumption and 
5% of global non-carbon dioxide green- 
house-gas emissions (mainly methane). 
Industrial wastewater is more expensive 
to clean. Those proportions will rise in the
next decade as the world’s population grows
and stricter water-quality standards are
enforced by developing countries.

The costs could be more than recouped 
if valuable chemicals — including useful 
forms of carbon, nitrogen and phosphorus 
— were captured from wastewater. Water- 
treatment plants that harness methane could
produce electricity rather than consume it,
for instance. Scaled up, emerging technologies
could efficiently and cheaply recover
phosphate and ammonium for fertilizer.

What stands in the way of creating 
‘wastewater-resource factories’? Uncertainty
about which techniques are most
useful and how to combine them. Here, we
outline one possible strategy for domestic 
water (see ‘Wastewater works’), illustrating 
how treatment plants that now cost millions 
of dollars a year to run could be retuned to 
generate more than US$1 million a year for 
communities. Similar schemes applied to 
more diverse industrial wastewater would 
deliver further benefits. 


DOWN THE DRAIN 

Domestic wastewater contains the detritus 
of our daily lives — faeces, fat, food scraps, 
detergents and pharmaceuticals. In chemical 
terms, 1 cubic metre of domestic wastewater 
contains 300-600 grams of carbon-rich 
organic matter (known as carbonaceous 
chemical oxygen demand, or COD), 
40-60 grams of nitrogen (in the form of 
ammonium and organic compounds), 
5-20 grams of phosphorus (in phosphates 
and organic compounds), 10-20 grams of 
sulfur (mainly as sulfate) and traces of heavy 
metal ions. 
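
As a rough illustration of what those concentrations mean at plant scale, the sketch below converts them into daily mass flows. The flow of 100,000 cubic metres per day is an assumption borrowed from the medium-sized plant that this article uses as its example; the function and variable names are illustrative only.

```python
# Concentration ranges in grams per cubic metre, as quoted in the text.
CONCENTRATION_G_PER_M3 = {
    "organic matter (COD)": (300, 600),
    "nitrogen": (40, 60),
    "phosphorus": (5, 20),
    "sulfur": (10, 20),
}

# Assumed flow: the article's 'medium-sized' plant.
FLOW_M3_PER_DAY = 100_000


def daily_load_tonnes(flow_m3_per_day, grams_per_m3):
    """Convert a concentration (g per m3) into a mass flow (tonnes per day)."""
    return flow_m3_per_day * grams_per_m3 / 1_000_000  # 1 tonne = 1,000,000 g


loads = {
    name: (
        daily_load_tonnes(FLOW_M3_PER_DAY, low),
        daily_load_tonnes(FLOW_M3_PER_DAY, high),
    )
    for name, (low, high) in CONCENTRATION_G_PER_M3.items()
}

for name, (low, high) in loads.items():
    print(f"{name}: {low:g} to {high:g} tonnes per day")
```

Under these assumptions, such a plant would receive roughly 4-6 tonnes of nitrogen and 0.5-2 tonnes of phosphorus each day.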

For the past century, the bulk of domestic 
wastewater has been treated using the aero- 
bic ‘activated-sludge process’: it is whisked 
with air and bacteria to oxidize the pollut- 
ants. The process is simple and is effective 
at removing organic compounds, nitrogen 
and phosphorus. But it has a large energy
and carbon footprint. A medium-sized 
plant (one that processes 100,000 cubic 
metres of water per day) consumes as much 
electricity as a Chinese town of 5,000 peo- 
ple (around 0.6 kilowatt-hours per cubic 
metre of wastewater) and emits as much 
CO₂ as 6,000 cars per day.

The energy embodied in the waste- 
water’s organic matter is squandered. Also 
discarded are forms of nitrogen and phos- 
phorus that would be valuable for making 
fertilizers. Precipitated by adding calcium, 
iron or aluminum salts, 90% of the phos- 
phorus ends up buried in landfill because 
the precipitates cannot be taken up by plants 
and are often contaminated with toxic metals.
Likewise, more than 80% of the nitrogen
is lost through conversion to nitrogen
gas by microbes. The process also produces 
a lot of ‘wet sludge’ (5-10 kilograms
per cubic metre of treated water). The
drying and disposal (on land or in landfill) 
or incineration of this accounts for 30-50% 
of a treatment facility's overall costs. 

Some wastewater plants digest the sludge 
anaerobically. Here, microorganisms in the 
absence of oxygen break down complex 
organic matter into simpler organic molecules,
which are then converted into methane.
By combusting the methane to produce
electricity and heat, anaerobic digestion can
offset 20-30% of the energy and greenhouse- 
gas costs of the activated-sludge process. But 
digestion is slow, taking 10-20 days. 


PROMISING SYSTEMS 

Applying anaerobic practices directly to 
domestic wastewater could reverse those 
costs entirely and generate an excess of 
energy, but it is not currently possible at 
ambient temperatures and with low concentrations
of organics. That could change
with two new technologies being trialled,
if they can be scaled up.

The first technology is the anaerobic mem- 
brane bioreactor (AnMBR). It uses a porous 
membrane to retain and concentrate solids 
(including particulate organic matter and the 
slow-growing microbes that produce meth- 
ane gas) and more than 90% of the dissolved 
organic matter in wastewater. By prolonging
the materials’ degradation time, it allows
25-100% more methane to be produced per 
cubic metre of treated water. More than 90% 
of the dissolved methane (at concentrations of 
10-20 milligrams per litre) can be extracted 
with gas or vacuum techniques, using relatively
little energy (less than 0.05 kilowatt-hours
per cubic metre; kWh m⁻³).

Several pilot AnMBRs have been success- 
fully used for domestic wastewater treat- 
ment; a facility that can process 12 cubic 
metres per day at the Bucheon wastewater- 
treatment plant in South Korea has run for 
more than 2 years. The biggest challenge 
in scaling up this technology is preventing 
the membrane from becoming clogged, 
or ‘fouled’. Using gas bubbles or fluidized 
granular activated carbon to scour the 
membrane surface clean requires a further 
0.2-0.6 kWh m⁻³ of energy, comparable to
that used in the activated-sludge process. 

A second option involves microbial 
electrochemical cells (MXCs) that either gen- 
erate electrical power directly, in the mode of 
microbial fuel cells, or produce energy-rich 
chemicals such as hydrogen gas in microbial 
electrolysis cells. MXCs take advantage of
the ability of some bacteria, as they
metabolize organic matter, to transfer electrons
through their cell membranes to receptors
outside. If passed to the anode of a fuel
cell, the electrons can deliver a current. 

The products of MXCs — electricity 
or hydrogen gas — are more valuable and 
readily used than methane. But the reactions
involved are slow (taking several days),
notably the initial break-up of particulates, 
which account for half of the organic matter 
(COD) in domestic wastewater. A promis- 
ing possibility is integrating MXCs with 
an AnMBR to speed up the conversion of 
organic matter while producing methane 
and electricity or hydrogen.

But current MXCs perform poorly on 
large scales. Enlarging or stacking multiple 
cells increases their resistance and lowers 
the efficiency at which energy may be recov- 
ered. Several pilot, cubic-metre-scale facili- 
ties for domestic wastewater treatment have 
been reported, including: one using 120-litre 
microbial-electrolysis-cell cassettes, installed 
in Howdon, UK, that recovers less than half 
of the electrical energy input as hydrogen 
gas; and a 250-litre microbial-fuel-cell unit 
installed in Harbin, China, that converts 
only 7% of the embodied energy in organic 
substances to electricity. 


NUTRIENT RECOVERY 
What of nitrogen and phosphorus? Anaero- 
bic treatment releases them into the efflu- 
ent as ammonium and phosphate ions. 
The effluent can be used to irrigate nearby 
fields. But more valuable are nitrogen and 
phosphorus in forms that can be stored and 
transported. One option is recovering both 
as struvite, a slow-release fertilizer that is 
precipitated by adding magnesium and lime. 
This is commercially viable at the high phos- 
phate and ammonium concentrations (hun- 
dreds of milligrams per litre) found in sludge 
or livestock wastewater, but it is ineffective 
for domestic wastewater.

[Figure ‘Pollutants to profits’: Capturing energy, nitrogen, phosphorus and water can turn wastewater treatment from a major cost into a source of profit.]

Two emerging technologies, ion
exchange and electrodialysis, capture and
concentrate phosphorus and nitrogen enough
to be recovered from effluent as struvite. In
the first, phosphate ions are swapped with 
anions (such as carbonate) or ammonium 
ions swapped with cations (such as sodium 
ions) and adsorbed by materials such as iron- 
based hydroxides, zeolites and polymers. In 
the second, an electric field and membrane 
separate phosphorus and nitrogen ions from 
others on the basis of charge and size. 

Both technologies are still being debugged
on small scales. Problems include incomplete
recovery of ions from the exchanger; the
exchanger or membrane becoming blocked
by organic matter; salts contaminating the
concentrate; and cost. For example,
membranes currently cost hundreds of dollars
per square metre. And electrodialytic
extraction (at a recovery rate of 90%) of phosphorus and
nitrogen consumes roughly 0.23 kWh m⁻³
and 0.14 kWh m⁻³, respectively, around
two-thirds of the energy consumed in the
activated-sludge process. Use of MXCs may
partly offset that energy input by generating
electricity, but microorganisms and biomolecules
aggravate membrane fouling.
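
The ‘two-thirds’ comparison can be checked with a few lines of arithmetic. The sketch below assumes that the two electrodialysis figures simply add, and that the baseline is the roughly 0.6 kWh per cubic metre quoted earlier for the activated-sludge process; the variable names are illustrative.

```python
# Energy demands in kWh per cubic metre of wastewater, as quoted in the article.
ACTIVATED_SLUDGE_KWH_PER_M3 = 0.6
ELECTRODIALYSIS_P_KWH_PER_M3 = 0.23  # phosphorus extraction
ELECTRODIALYSIS_N_KWH_PER_M3 = 0.14  # nitrogen extraction

# Assumption: the two extraction steps are run separately, so their demands add.
combined_kwh_per_m3 = ELECTRODIALYSIS_P_KWH_PER_M3 + ELECTRODIALYSIS_N_KWH_PER_M3
share_of_baseline = combined_kwh_per_m3 / ACTIVATED_SLUDGE_KWH_PER_M3

print(f"combined demand: {combined_kwh_per_m3:.2f} kWh per cubic metre")
print(f"share of activated-sludge demand: {share_of_baseline:.0%}")
```

On these numbers the combined demand is a little over 60% of the baseline, in line with the rounded figure in the text.
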
Nitrogen recovery from wastewater in 
particular would have a global impact. In 
the lab, extraction of nitrogen has received 
less attention than has phosphorus extrac- 
tion, because atmospheric nitrogen gas can 
be easily reduced to synthesize nitrogen 
fertilizer. But the process involved, the
nitrogen-fixing Haber–Bosch process, is
energy intensive: it accounts for a few per
cent of the world’s annual energy use. Substituting
recovered nitrogen for just 5% of existing
nitrogen-fertilizer production would save more than
50 terawatt-hours of energy, or 1.5% of China’s
annual electricity consumption.
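
For readers who want the totals behind those percentages, the sketch below backs them out. It is a back-of-envelope inference, not a figure from the authors: it assumes that the saving scales linearly with the fraction of production replaced, and it treats ‘more than 50 terawatt-hours’ as exactly 50.

```python
SAVING_TWH = 50               # energy saved, taken at its quoted lower bound
SUBSTITUTED_FRACTION = 0.05   # share of nitrogen-fertilizer production replaced
SHARE_OF_CHINA = 0.015        # the saving as a share of China's electricity use

# Assumption: energy use is proportional to the amount of fertilizer produced.
implied_fertilizer_energy_twh = SAVING_TWH / SUBSTITUTED_FRACTION
implied_china_electricity_twh = SAVING_TWH / SHARE_OF_CHINA

print(f"implied energy for all nitrogen fertilizer: {implied_fertilizer_energy_twh:.0f} TWh per year")
print(f"implied Chinese electricity consumption: {implied_china_electricity_twh:.0f} TWh per year")
```

The first value is a lower bound on the energy used in nitrogen-fertilizer production as a whole, and the second is the annual Chinese electricity consumption that the 1.5% figure implies.
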
Biosolids — biomass from microbial 
growth and undigested faeces, fibres and 
other solids from the wastewater — are other 
by-products of anaerobic digestion that con- 
tain nitrogen and phosphorus. If they are 
stabilized (to avoid generating methane gas 
or odours) and detoxified (no pathogens or 
hazardous chemicals) during anaerobic treatment,
they can be applied directly to the soil.
The United States spreads 55% of its treated 
biosolids onto the land, but this practice is 
under public and regulatory pressure because 
the waste is difficult to stabilize and detoxify 
completely, and heavy metals accumulate. 
Heat treatment makes biosolids easier 
and safer to use. It kills pathogens, improves 
nutrient retention and lessens heavy-metal 
release. Heat from combusted methane 
can be used to lower energy needs, but the
safety of biosolid products still needs to be 
improved and evaluated at larger scales. 

[Figure ‘Wastewater works’: Extracting carbon, nitrogen and phosphorus compounds from used water using a series of reactors would transform treatment plants into profitable sources of energy, fertilizer and clean water. A bed of activated carbon (1) and a membrane (2) trap organics and slow-growing anaerobic microorganisms, which convert the organics into …]

The final product, water, has huge
economic value: the global average price
for potable water is $2 per cubic metre. 
Each type of use requires water of a differ- 
ent quality — from the cleanest for drinking 
to lower-quality water for cooling or indus- 
try uses. The treatment technology needed 
varies accordingly. In China, only 15% of 
treated water is reused and up to 98% of 
potable water goes to municipal and indus- 
trial sectors that could make do with lower- 
quality water. A ‘fit-for-purpose’ treatment 
and reuse strategy is needed. 


ECONOMIC BENEFITS 

We estimate that a domestic wastewater- 
resource factory serving a city of about half a
million people in China would treat around 
100,000 cubic metres of domestic wastewater 
per day. We calculate that each day it could 
produce around 17,000 kWh of electrical 
energy, recover 1 tonne of phosphorus and 
5 tonnes of nitrogen, and reclaim 1,000 cubic 
metres of potable water. By contrast, an acti- 
vated-sludge plant (with anaerobic digestion) 
of the same size would consume 50,000 kWh
of electrical energy and recover no phosphorus
or nitrogen. A resource factory would
thus save 67,000 kWh per day (and that is 
without considering the energy saved in fer- 
tilizer production). This is equivalent to 1.5% 
of the city’s daily electricity consumption. 
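
The energy balance above can be reproduced directly. The sketch below uses only the figures in this section; the city-wide and per-person demands it derives are inferences from the 1.5% figure rather than numbers given by the authors, and the variable names are illustrative.

```python
FACTORY_OUTPUT_KWH_PER_DAY = 17_000       # electricity produced by the resource factory
CONVENTIONAL_DEMAND_KWH_PER_DAY = 50_000  # electricity used by an activated-sludge plant
SAVING_SHARE_OF_CITY = 0.015              # the saving as a share of city demand
POPULATION = 500_000                      # 'about half a million people'

# The saving is the avoided demand plus the new output.
net_saving_kwh = FACTORY_OUTPUT_KWH_PER_DAY + CONVENTIONAL_DEMAND_KWH_PER_DAY

# Inferred, not quoted: total city demand consistent with the 1.5% figure.
implied_city_demand_kwh = net_saving_kwh / SAVING_SHARE_OF_CITY
implied_kwh_per_person = implied_city_demand_kwh / POPULATION

print(f"net saving: {net_saving_kwh:,} kWh per day")
print(f"implied city demand: {implied_city_demand_kwh:,.0f} kWh per day")
print(f"implied demand per person: {implied_kwh_per_person:.1f} kWh per day")
```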

We estimate that such a factory could yield 
a profit of $1.8 million per year (excluding 
construction costs), compared with a cost of 
$4.6 million per year for an activated-sludge- 
treatment plant (see ‘Pollutants to profits’). 
That assumes the sale of only the 1% of water 
made drinkable; profits could be ten times 
higher if non-potable water were sold. 
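
A short sketch puts the financial figures side by side. The profit and cost are the authors’ estimates; the revenue line is a hypothetical illustration that assumes the potable water is sold at the global average price of $2 per cubic metre quoted earlier, which the article does not state as the factory’s selling price.

```python
FACTORY_PROFIT_USD_PER_YEAR = 1_800_000     # resource factory, excluding construction
CONVENTIONAL_COST_USD_PER_YEAR = 4_600_000  # activated-sludge treatment plant
POTABLE_M3_PER_DAY = 1_000
TREATED_M3_PER_DAY = 100_000
ASSUMED_PRICE_USD_PER_M3 = 2                # assumption: global average potable price
DAYS_PER_YEAR = 365

# Moving from a net cost to a net profit changes the annual balance by the sum of the two.
annual_swing_usd = FACTORY_PROFIT_USD_PER_YEAR + CONVENTIONAL_COST_USD_PER_YEAR

# Share of the treated water that is made drinkable.
potable_share = POTABLE_M3_PER_DAY / TREATED_M3_PER_DAY

# Hypothetical revenue from selling only the potable fraction.
potable_revenue_usd = POTABLE_M3_PER_DAY * ASSUMED_PRICE_USD_PER_M3 * DAYS_PER_YEAR

print(f"annual swing: ${annual_swing_usd:,}")
print(f"potable share of treated water: {potable_share:.0%}")
print(f"potable-water revenue at the assumed price: ${potable_revenue_usd:,} per year")
```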

The economic boon could be higher still 
for industrial wastewaters in the agricultural, 
food and petrochemical sectors. For example,
AnMBRs can remove up to 98% of the
organic matter (around 18 kilograms per 
cubic metre) from petrochemical effluent, 
producing 100 times more methane than is 
achievable with domestic wastewater. Live- 
stock wastewater is rich in organic molecules 
and phosphorus, making it an important 
potential source of energy and fertilizer.

Government support will be crucial to
developing wastewater-resource factories and 
promoting a sustainable water-resource mar- 
ket. For the next decade, extracting resources 
from wastewater will remain expensive rela- 
tive to fossil-fuel energy and current process- 
ing methods. Why? Because environmental 
costs are not yet factored into pricing and 
emerging recovery technologies have not yet 
benefited from economies of scale. Priorities 
will change as energy, resource and global- 
warming stresses intensify. 

What next? Governments must establish 
regulatory frameworks that include the costs 
of waste disposal and greenhouse-gas emis- 
sions. They must invest in demonstrations at 
scale of the pre-commercial or early-adopter 
technologies; initially subsidize the sales of 
recovered products; and promote the benefits 
of the recycled-resource concept. 

Governments and enterprises in the sector 
should provide targeted research funds as 
well as land and infrastructure. To ensure 
that the products are suitable, technologi- 
cal development must involve input from 
regulators, managers of wastewater facilities, 
engineers, researchers and the public. 

National initiatives are needed that suit 
local environmental, economic and social 
conditions. Industrialized countries should 
integrate the emerging processes when 
they replace ageing treatment facilities. 
And emerging economies such as China 
and India should incorporate them as they 
expand their water-treatment capacities.

Han-Qing Yu is professor of wastewater
systems and sustainability at the Chinese 
Academy of Sciences’ Key Laboratory of 
Urban Pollutant Conversion, University 
of Science & Technology of China, Hefei, 
China. Bruce E. Rittmann is professor of 
environmental engineering and director 
of the Swette Center for Environmental 
Biotechnology, Arizona State University, 
Tempe, Arizona, USA.

Most soils are in private ownership, making it tricky to implement binding international agreements.


Govern our soils 


Luca Montanarella calls for a voluntary international agreement to protect 
the ground beneath our feet from erosion and degradation. 


Eighty years ago, in 1935, soils were for
the first time officially recognized as a
limited national resource that should
be responsibly managed. In the wake of the 
catastrophic erosion that caused the infa- 
mous Dust Bowl drought, the US govern- 
ment passed the Soil Conservation Act. “The 
history of every Nation is eventually written 
in the way in which it cares for its soil,” wrote 
President Franklin D. Roosevelt. 

Roosevelt's act was largely successful. It 
encouraged farmers to apply sustainable 
management practices — such as tilling 
less, installing windbreaks, and planting 
along slope contours. Between 1982 and
2007, soil erosion in US cropland declined 
by 43% (ref. 2). 

The history now being written in the 
world’s soils is not so rosy. Every year, 
75 billion tonnes of crop soil are lost 
worldwide to erosion by wind and water, 
and through agriculture; this costs about 
US$400 billion a year. Only a few countries
have national legislation protecting
soil, including Germany and Switzerland.
Attempts at binding international legal
agreements have so far failed.

This cannot go on. Soils are a limited 
natural resource, unequally divided between 
nations and people. They provide fertilizer 
for growing food; store and filter water; 
host rich ecosystems, including many little- 
known species; provide resources such as 
peat, sand, clay and gravel; and hold our cul- 
tural and historical memory in archaeologi- 
cal artefacts. The ground beneath our feet is 
a public good and service. 


GET OFF MY LAND 
Without governance to assure wise manage- 
ment and equitable access, we are heading 
towards increased poverty, hunger, conflict, 
land grabs and mass migration of displaced 
populations, such as that seen during the 
Great Depression’. The world now stands 
at a moment of opportunity. A Global Soil 
Partnership (GSP) exists, and could imple- 
ment a voluntary system of global govern- 
ance. But the GSP needs to develop clear, 
concrete proposals for action to secure more 
funding and move forwards. 

International soil governance faces great 
challenges. Take, for example, a nearly 
decade-long attempt by the European Union 
to implement a governance framework. A 
team at the European Commission (of which 
I was part) developed a common EU strat- 
egy for soil protection’ including a proposed 
EU Soil Framework Directive, which would 
have obliged member states to take action 
to prevent soil degradation. It was the result 
of several years of consultations in special- 
ized working groups that included scientists, 
policymakers, industry representatives, 
landowners and farmers, as well as con- 
cerned non-governmental organizations 
(NGOs) and other stakeholders. Much was 
at stake, including the ongoing, costly reme- 
diation of more than 3 million contaminated 
sites in Europe, such as old industrial areas 
and mining sites, and the question of who 
should pay. 

Several EU member states opposed the 
directive. Their arguments were much the 
same as those used in 1935 by opponents 
to the US Soil Conservation Act. They 
countered that soils are a strictly local issue, 
and should be governed locally rather than 
by a central authority (the subsidiarity 
principle). They noted that because most 
soils are privately owned, they should not 
fall under the remit of public governance, 
and pointed out that soils do not move, and 
therefore there is no need for transnational 
or global governance instruments. After 
some debate and a long period of apathy, the 
directive was withdrawn by the European 
Commission in May 2014. 

The counter-argument is simply that 
good-quality soils are necessary for the 
food, fibre and fuel of a growing population. 
That makes soil — like air and water — a shared 
resource that requires governance. 
And, because most soils are indeed 
privately held, legally binding international 
agreements are unrealistic. Instead, govern- 
ance must be based on voluntary efforts by 
national governments, local land owners and 
administrations. 

Progress so far has been disappointing. 
In 1982, the Food and Agriculture Organi- 
zation of the United Nations (FAO) adopted 
a World Soil Charter with 13 recommen- 
dations for sustainable soil management. 
It enshrines some basic principles such 
as: “the use of these resources should not 
cause their degradation or destruction 
because man’s existence depends on their 
continued productivity”. That charter was 
endorsed by all members of FAO (nearly 
all national governments). It remains 
largely ignored. 

The dramatic rise in food prices during 
the 2008 global food-commodities crisis 
finally raised the attention of policymakers. 
That led to the creation in 2011 of the FAO’s 
GSP: a voluntary body tasked with finally 
enacting the soil charter’s principles. 


TIME FOR LEGISLATION 
The GSP has concentrated its activities 
on promoting sustainable management of 
soils, for example by encouraging consist- 
ent research, education and good policy. 
In 2016, it will launch a World Soil Prize 
to reward best practice. Concrete action 
on the ground is in the hands of Regional 
Soil Partnerships that include all local 
stakeholders. So far, most of the GSP’s 
work has been in organizing conferences 
and developing task-force plans of action. 
Sadly, these mostly provide vague expres- 
sions of intent. Four years after its crea- 
tion, the GSP is under increasing pressure 
from NGOs and funders to deliver results. 
The GSP’s clearest call is for the develop- 
ment of a Global Soil Information System. 
Unfortunately, the GSP failed to establish 
a comprehensive partnership with everyone 
involved, and as a result several parallel 
independent projects have emerged, such 
as the GlobalSoilMap.net consortium and 
the Global Soil Information Facilities. 
Bringing all of these efforts together will 
be difficult. 

To underpin the GSP, an Intergovern- 
mental Technical Panel on Soils (ITPS; of 
which I am chair) was established in June 
2013. Like the Intergovernmental Panel on 
Climate Change, the ITPS aims to provide 
scientific and technical guidance to policy- 
makers. It is composed of 27 soil experts 
from across the seven FAO regions. Our 
ambition is to serve the GSP and all soil- 
related multilateral environmental bodies, 
such as the United Nations Convention to 
Combat Desertification, the Convention 
on Biological Diversity and the United 
Nations Framework Convention on Cli- 
mate Change. 

The main product of the ITPS’s first two 
years is the Status of World’s Soil Resources 
report, scheduled for release at the closing 
ceremony of the UN International Year of 
Soils in December 2015. The report, the first 
comprehensive assessment of global soil 
resources, is the collaborative effort of more 
than 200 scientists. It highlights serious con- 
cerns such as nutrient imbalance: some parts 
of the world suffer from an excess of fertilizer 
use, whereas much of the developing world 
suffers from a severe lack of fertilizers. The 
ITPS is preparing practical recommenda- 
tions for reversing these trends. 

The GSP is the best current option for 
driving forward those recommendations, 
despite its shortcomings. The partnership 
needs to motivate all invested parties to 
develop commitments to specific actions. 
These should enshrine soil management 
in legislation tailored to each country’s 
needs. The GSP needs to prove that it can 
be more than just a talking shop, and can 
generate political will and raise funding. 
The FAO has suggested an initial budget 
of $64 million over five years for the GSP’, 
mainly to help to develop the Global Soil 
Information System and to promote training 
and capacity building in developing 
countries. So far, less than 10% of that has 
been raised from donors, mainly the European 
Commission. 

The 1930s Dust Bowl drought prompted the first soil-conservation act, in the United States. 

Increasingly, people speak of ‘soil 
security’, in analogy with food and water 
security. In a world facing increasing stress 
from a growing, hungry population and 
changing climate, soils will become ever 
more important. 


Luca Montanarella is a senior expert at 
the Joint Research Centre of the European 
Commission in Ispra, Italy, and chair of the 
Intergovernmental Technical Panel on Soils. 
e-mail: luca.montanarella@jrc.ec.europa.eu 

GEOPHYSICS 

Port-au-Prince after the devastating 2010 earthquake that killed more than 85,000 Haitians. 

Vast forces underfoot 


Andrew Robinson examines three books that see seismicity as both grimly 
destructive and, in some contexts, culturally energizing. 


A blink in geological time — 150 years 
— has passed since Jules Verne 
published his fantasy Journey to the 
Centre of the Earth (see D. Chatelain and 
G. Slusser Nature 513, 169-170; 2014). Half 
a century later, geologist and meteorologist 
Alfred Wegener published his radical theory 
of continental drift, The Origin of Continents 
and Oceans (see T. Nield Nature 526, 192-193; 2015). 
And half a century after that, in 
1965, the theory of plate tectonics — partly 
inspired by Wegener — was established by 
geophysicist John Tuzo Wilson among others. 
In the subsequent half-century, explora- 
tion of the Solar System has revealed that 
Earth is the only planet in it with a global 
system of plate tectonics. Satellites in the US 
Global Positioning System monitor plate 
movements with an accuracy of a few millimetres. 
But our understanding of Earth's 
mantle and core is much less advanced. 
The deepest borehole penetrates just 12,262 
metres, two-thousandths of Earth’s radius. 
Reliant mainly on seismographic monitoring, 
modelling and post-quake analysis, 
geophysicists and seismologists remain 
perplexed about the exact structure of the 
inner core and the precise cause of earthquakes. 
Developments in seismology since 
the 1930s, when Charles Richter invented his 
local-magnitude scale, have ranged from laboratory 
fault-friction experiments to global 
seismic tomography. But our ability to predict 
the timing, location and magnitude of 
earthquakes has scarcely progressed. 

Journey to the Centre of the Earth: The Remarkable Voyage of Scientific Discovery into the Heart of Our World 
DAVID WHITEHOUSE 
Weidenfeld & Nicolson: 2015. 

Earthquake Time Bombs 
ROBERT YEATS 
Cambridge University Press: 2015. 

Impact of Tectonic Activity on Ancient Civilizations: Recurrent Shakeups, Tenacity, Resilience, and Change 
ERIC R. FORCE 
Lexington: 2015. 

Now, three books examine earthquakes 
from distinct angles. In his Journey to the 
Centre of the Earth, astronomer and BBC 
science broadcaster David Whitehouse takes 
the reader on a scientific journey from crust 
to core in a book inspired by Verne’s, but making 
slight reference to it. Seismologist Robert 
Yeats, in Earthquake Time Bombs, focuses 
on the crust, and how to protect vulnerable 
conurbations — his “time bombs” — from 
probable seismic shocks. And geologist 
and geoarchaeologist Eric Force investigates 
earthquakes from the third millennium 
BC in Impact of Tectonic Activity on Ancient 
Civilizations, theorizing that they stimulated 
trade and helped to shape civilizations. 
Whitehouse’s account is the most readable 
and wide-ranging, although it is inevitably 
speculative. “We will reach the distant stars 
before we reach the centre of the Earth,” 
he writes, after descending more than 
1,000 metres into one of the deepest mines in 
Europe, the Boulby potash mine in northeast 
England. He is also the most adept at mix- 
ing the history of Earth science — beginning 
with Edmond Halley’s maritime expedition 
to measure Earth’s magnetic field around 
1700 — with comments by current research- 
ers. One of them admits that “everything” 
about the inner core — structure, anisotropy, 
topography and dynamics — “is getting 
increasingly complex as we get more data”. 
However appealing, Whitehouse’s account 
contains errors. For instance, the disastrous 
1906 San Francisco earthquake, in which 
more than 3,000 people died, did not prompt 
the introduction of “building and emergency 
regulations”; those came decades later in 
California. Indeed, San Franciscans did their 
best to blame the city’s destruction on the fire 
started by the earthquake and to carry on with 
‘business as usual’. Nor did seismologist Beno 
Gutenberg, mentor of Richter, flee persecu- 
tion in Germany in 1933 for a job at the Uni- 
versity of California. He left in 1930, before 
Adolf Hitler came to power, and settled at the 
California Institute of Technology in Pasa- 
dena, along with a visiting Albert Einstein. 
Yeats’s book is a follow-up to his mag- 
num opus, the specialist Active Faults of the 
World (Cambridge University Press, 2012). 
Earthquake Time Bombs aims to reach a 
wider audience. Writing that the “next great 
earthquake will be a disaster, but failing to 
prepare for it will lead to a catastrophe”, he 
recounts how in early 2010 he told a Scientific 
American reporter that Port-au-Prince 
should be regarded as a time bomb. Its swelling 
population occupied dilapidated slums 
adjacent to a plate-boundary fault that had 
not sustained a major earthquake since the 
mid-eighteenth century. A week after he 
made his comments, a magnitude-7 earth- 
quake destroyed the Haitian capital, killing 
at least 85,000 people (the government put 
the figure at more than 300,000), mainly as 
a result of inadequate and corrupt building 
practices (R. Bilham Nature 502, 438-439; 
2013). Yeats had not, of course, predicted the 
quake. He was simply aware of research on 
the fault published in 2008 by Eric Calais and 
his colleagues (D. M. Manaker et al. Geophys. 
J. Int. 174, 889-903; 2008). These researchers 
had privately alerted the Haitian government, 
but advised that they could not predict 
the timing of the recurrence. 

Sixty of the world’s largest cities lie on 
plate boundaries and are at risk from inter- 
plate earthquakes. Yeats duly discusses the 
usual suspects, such as San Francisco, Tokyo, 
Istanbul and Santiago. But he also explores 
less familiar threats, including the Cascadia 
subduction zone (Seattle, Portland and Van- 
couver), where he lives, along with Tehran, 
Kabul, parts of the Himalayan region, Manila, 
Caracas, Wellington and the East African Rift 
Valley. Surprisingly, he neglects the hazard 
from intraplate quakes that occur away from 
boundaries, including that in Gujarat, India, 
in 2001 and, most famously, the 1811-12 
earthquakes in Missouri in the middle of the 
North American plate. The Missouri quakes 
have provoked much debate among lead- 
ing US seismologists such as Susan Hough 
and Seth Stein, author of Disaster Deferred 
(Columbia University Press, 2010), a book 
that Yeats does not mention. He concludes 
with convincing grimness that only California, 
Japan, Chile and New Zealand have 
taken the earthquake hazard seriously. 

Where Yeats expounds on the destructive 
power of quakes, Force posits that they may 
have rocked the cradles of past civilizations. 
High tectonic activity has accompanied the 
birth and growth of many ancient civiliza- 
tions in the Middle East, Greece and Italy 
and, to a lesser extent, the Indus Valley and 
China. During the second and first millennia 
BC around the Mediterranean Sea, 
the Minoan, Mycenaean, Greek, Etruscan 
and Roman civilizations arose during eras 
of major seismic activity in their regions. 
No comparable cultures developed on the 
relatively inactive coasts of Spain, France 
and Libya, observes Force. He suggests that 
frequent tectonic activity was a “long-term 
cultural stimulant”, forging ancient communities 
that were resilient, cooperative, 
innovative and outgoing, and where “elders 
would be passing on an expectation of 
change to younger generations”. 

It is a tantalizing thesis, which Force pur- 
sues tenaciously and with considerable skill. 
However, the book perhaps goes too far in 
its claim for the dominance of seismic activity 
in the development of civilizations. If the 
hypothesis is correct, how did Egypt, which 
had (and has) relatively low tectonic activity, 
produce a major civilization? Surely, climate, 
coasts, rivers, fertile soil and supplies of 
water, minerals, building materials and fuel 
are also key, even if some of these factors are 
also influenced by tectonic activity. 

Nevertheless, Force's speculation remains 
an intriguing possibility. Today, both Silicon 
Valley and Hollywood lie on the San Andreas 
fault. Just a coincidence? Or is the hidden 
cause of these powerhouses of imagination 
and innovation that the region has frequently 
been “all shook up”? 


Andrew Robinson is the author of three 
books on earthquakes, including the 
forthcoming Earth-Shattering Events: 
Earthquakes, Nations and Civilization. 
e-mail: andrew.robinson33@virgin.net 

The scarred Self 


Anthony King reviews an exhibition on the horror, and hope, posed by trauma. 


Less than a month after the terrorist 
attacks in Paris, Trauma, the latest 
show at Science Gallery Dublin, feels 
unnervingly relevant. Exploring, among 
other things, the surprising role of trauma 
in emotional resilience, this collection of 
objects and ideas ranges from haunting 
photos from Northern Ireland’s Troubles to 
a room-sized instrument crafted by a com- 
poser with tinnitus (see J. Hoffman Nature 
505, 159; 2014) and photograms of plant 
specimens from Chernobyl. 

Trauma, from the Greek word meaning 
‘wound’, is profoundly personal: the sufferer 
is sealed off within the mental or 
physical experience. Katharine Dowson’s 
Memory of a Brain Malformation, a deli- 
cate laser etching of a brain tumour in glass, 
emphasizes this isolation. Dowson — who 
often works with scientists and physicians 
— portrays the growth as a discrete entity 
inside a nest of sinuous veins. (The actual 
tumour was successfully removed from her 
cousin's brain by laser treatment.) The work 
evokes both the emotional trauma of diag- 
nosis and the energy of a positive outcome. 

External trauma to the head can be just 
as damaging, and the neurological prob- 
lems arising from it in sport are the focus 
of intense research. The installation Impact 
examines the design of helmets for sports 
such as American football and Irish hurl- 
ing, sparked by such studies. Mechanical 
engineer Ciaran Simms at Trinity College 
Dublin, for instance, examines the body’s 
response to high-force impacts in rugby; 
Stefan Duma at the Virginia Polytechnic 
Institute and State University in Blacksburg 
uses real-time sensors in the field and lab to 
rate commercial helmet design. 

Trauma: Built to Break 
SCIENCE GALLERY DUBLIN 
Until 21 Feb 2016. 

A darker realm of trauma is explored on 
the gallery’s ground floor in The Interroga- 
tion of Detainee 063, an infographic detail- 
ing 50 harrowing days in the interrogation 
of Mohammed al-Qahtani at the United 
States’ Guantanamo Bay camp in Cuba. The 
exhibit underlines the extreme suffering 
triggered by torture. Colour-coding shows 
the duration of interrogation, loud music 
and inhumane and degrading treatments, 
such as being forced to wear a muffling 
hood or humiliating signs, or to write letters 
of apology to victims of the terror attacks of 
11 September 2001. 

Upstairs, Stressed Body, Stressed Brain 
investigates one physiological response 
that is central to the notorious torture 
technique waterboarding. This response, 
the diving reflex, is triggered when the 
face is immersed in cold water. The exhibit 
invites viewers to lie down and have a 
damp cloth placed on their cheeks to 
gauge how this upsets memory recall and 
slows heart rate by as much as one-quarter. 
Its curators are physiologist Aine Kelly and 
neuroscientist Shane O’Mara, author of 
Why Torture Doesn’t Work (Harvard Uni- 
versity Press, 2015; see L. T. Harris Nature 
527, 35-36; 2015). 

O’Mara has shown elsewhere how stress 
and trauma can trigger the creation of false 
memories. Memory Laundering — essentially 
a large cabinet holding dozens of 
deposit boxes — plays with this mutability. 
Created by makers Designgoat, 
it is inspired by the work of 
neuroscientist Susumu Tonegawa 
and the team at the RIKEN-MIT 
Center for Neural Circuit Genet- 
ics at the Massachusetts Institute 
of Technology in Cambridge, who 
collaborated with the gallery. You 
are asked to write down one good 
and one bad memory, and place 
them in one of the boxes. When 
you return to retrieve them, the details have 
been edited by a gallery mediator concealed 
behind the cabinet. 

Katharine Dowson’s laser etching Memory of a Brain Malformation records her cousin’s brain tumour in glass. 

The most graphic of the exhibits is a 
series of photographs of an operating the- 
atre in Afghanistan’s Helmand Province. 
Sightlines I/Supernumerary by installation 
artist David Cotterrell is a record of his 
stint as an embedded photographer with 
the UK Joint Forces Medical Group. Dur- 
ing it, he created diptychs and triptychs of 
the visceral business of emergency medi- 
cine — containing and controlling trauma. 
The images reference the dramatic chia- 
roscuro of painters such as Caravaggio, 
evoking horror yet suggesting sublime 
beauty. Alongside the bloody collage 
stands XSTAT 30 Hemorrhage Control 
Device, an innovative syringe containing 
92 miniature cellulose sponges, 
designed to control severe bleeding. 

“Trauma is the ultimate insult,” 
says co-curator and neuroscientist Daniel 
Glaser. It is about not the moment, but 
the aftermath, he adds — a truth easily 
observable among victims from Syria to 
France. Dublin has had its own share of 
trauma. Next year sees the centenary of the 
Easter Rising, when Irish republicans pro- 
claimed independence from Britain. Six 
locations across the city will be stamped 
with a bandage symbol on a map avail- 
able at the gallery as part of artist Sarah 
Bracken’s Bandage, marking hidden scars 
from pitched street battles, arrests and 
executions by the British military during 
the rising. 

At base, this is a show about recovery. 
The tumour is excised; the blood flow 
is staunched; life goes on. From mental 
wounds to psychological damage — his- 
torically viewed as inevitable aspects of the 
human condition — the message in Trauma 
is ultimately positive. O'Mara explains why. 
Between 30% and 70% of traumatized peo- 
ple experience post-traumatic growth, he 
writes: their suffering opens up “new per- 
spectives not previously available to them”. 
As he notes, “trauma is something that life 
can profit from, enhancing resilience, and 
providing lessons to us all”. 


Anthony King is a writer based in Dublin. 
e-mail: anthonyjking@gmail.com 

It is rational to 
protect Antarctica 


We are dismayed that the 
international commission that 
oversees the Convention on 
the Conservation of Antarctic 
Marine Living Resources has 
voted against establishing 
marine protected areas (MPAs) 
around Antarctica for the fifth 
consecutive time. These MPAs 
are designed to protect wildlife 
hotspots of world significance. 

The main opponents were 
member states that fish or intend 
to fish for toothfish (Dissostichus 
spp.) and Antarctic krill 
(Euphausia superba). Toothfish, 
which are sold as Chilean sea 
bass, are the top fish predators 
in the Southern Ocean; krill is a 
crucial component of the marine 
food web that is sold as fishmeal 
and for fish-oil pills. 

The convention's goal of 
conservation is being marred 
by some member states who are 
misinterpreting the “rational 
use” proviso in its text. Originally 
intended to allow fishing in 
the Southern Ocean only if it 
complied with strict guidelines, 
this term is being misinterpreted 
as an unrestricted right to fish 
and as an excuse to block tighter 
regulations (see J. Jacquet et al. 
Mar. Policy 63, 28-34; 2016). 

The commission operates by 
consensus, so a single member 
state can prevent cooperation. 
This year, China and Russia 
blocked the proposed MPAs for 
the east Antarctic — even though 
these included boundaries 
designed to accommodate 
fisheries — and Russia blocked 
an MPA in the Ross Sea. 
Jennifer Jacquet New York 
University, New York, USA. 
Cassandra Brooks Stanford 
University, California, USA. 
jacquet@nyu.edu 


Mining disaster: 
huge species impact 
On 5 November, a huge mudflow 
contaminated with iron ore from 
mine workings was released into 
the Rio Doce river in southeast 
Brazil after two dams broke. 
Immediate action is necessary 
to evaluate the massive human 
and ecological impact of this 
catastrophe, and there must be 
a concerted effort to prevent 
further such incidents. 

As well as killing several 
people, the accident threatens 
the water supply of many large 
cities downstream that are 
already severely limited by a 
long-standing drought. The 
polluted river runs through the 
Atlantic rainforest and is likely to 
damage the exceptional endemic 
fauna and flora in its waterways. 

Of the 71 recognized fish 
species in the river, 11 were 
considered endangered before 
the mud slide (see go.nature.com/zmry1z; 
in Portuguese). 
The accident also interrupted 
reproductive migrations for 
many of these species. 

Markus Lambertz Zoological 
Research Museum Alexander 
Koenig, Bonn, Germany. 

Jorge A. Dergam Federal University 
of Viçosa, Minas Gerais, Brazil. 
m.lambertz@zfmk.de 


Mining disaster: 
restore habitats now 


In Brazil’s Atlantic rainforest 
region last month, cities 
were flooded and watersheds 
contaminated when some 
50 million cubic metres of heavily 
polluted water was released from 
an iron-ore tailings pond. The 
mining company responsible and 
Brazil's environment ministry 
should act swiftly to mitigate the 
human and ecological damage. 

The release has deprived some 
500,000 people of their water 
supply. It is likely to damage the 
entire ecological network through 
chemical pollution, reduced 
oxygen availability and high 
turbidity, further threatening the 
region’s status as one of the world’s 
biodiversity hotspots. 

Authorities will need to 
collaborate with universities 
on ecosystem restoration and 
revitalization projects. 


Jhonny Capichoni Massante 
Federal University Fluminense, 
Niterói, Rio de Janeiro, Brazil. 
jcmassante@id.uff.br 


Star universities in 
the Muslim world 


As former chairman of Pakistan's 
Higher Education Commission 
and former coordinator-general 
of the Organisation of Islamic 
Cooperation’s science and 
technology body COMSTECH, 
I suggest that some universities 
in the Muslim world are not in 
such dire need of revitalization 
as Nidhal Guessoum and Athar 
Osama imply (Nature 526, 
634-636; 2015). 

At least 3 such institutions 
are ranked in the world’s top 250 
— the University of Malaya 
in Kuala Lumpur, and King 
Fahd University and King Saud 
University, both in Saudi Arabia 
(see go.nature.com/4gfu2u). In 
2013 and 2014, the Middle East 
Technical University, Istanbul 
Technical University and Bilkent 
University in Turkey were 
ranked in the top 400 globally 
(see go.nature.com/m6195d). 
Pakistan's National University of 
Sciences and Technology and the 
Pakistan Institute of Engineering 
and Applied Sciences were ranked 
in the top 200 Asian universities 
in 2014 (see go.nature.com/kdwt8w). 
The King Abdullah 
University of Science and 
Technology in Saudi Arabia 
and the Masdar Institute in Abu 
Dhabi are rising stars. 

According to 2014 data on 
scientific publications, Iran ranks 
16th in the world, Turkey is 19th 
and Malaysia is 23rd — on a par 
with Switzerland, Taiwan and 
some Scandinavian countries, 
and ahead of South Africa (see 
go.nature.com/msé6fct). 

Furthermore, the requirements 
of the United Arab Emirates’ 
Commission of Academic 
Accreditation (CAA) are more 
stringent than those of the 
US Accreditation Board for 
Engineering and Technology 
(ABET), for instance. Whereas 
the CAA requires faculty 
members to have the highest 
degree in their field (such as 
a PhD), ABET requires only 
appropriate qualifications. The 
CAA also requires universities to 
have accredited PhD programmes 
in addition to accredited 
bachelor’s and master’s degrees. 
Javaid Laghari Pasadena, 
California, USA. 
jlaghari@gmail.com 


Microbiome studies 
need local leaders 


As researchers on the Brazilian 
Microbiome Project, we 
contend that creating a robust 
International Microbiome 
Initiative (IMI) needs local 
leadership rather than top- 
down scientific unification (see 
N. Dubilier et al. Nature 526, 
631-634; 2015). 

Microbial diversity 
and function are tied to 
geographically relevant features, 
so local investigation of these 
peculiarities is needed to 
underpin national biodiversity- 
protection measures. Researchers 
attached to such projects can 
boost their country’s reputation 
in science and technology. If the 
IMI succumbs to pressure to 
avoid local research consortia, 
it could bias scientific priorities 
and project management 
towards the interests of a few, and 
compromise the independent 
verifiability of the science. 

Resources expended on global 
collaborations without a clear 
description of aims could also 
result in an endless development 
of standards and protocols (see 
Nature http://doi.org/9gx; 2015). 
In our view, it is important to unite 
researchers locally to discuss such 
issues before imposing a 
pre-established model. 
Victor S. Pylro, Daniel K. 
Morais René Rachou Research 
Center (CPqRR-FIOCRUZ), Belo 
Horizonte, Minas Gerais, Brazil. 
Luiz F. W. Roesch Federal 
University of Pampa, São Gabriel, 
Rio Grande do Sul, Brazil. 
victor.pylro@brmicrobiome.org 

OBITUARY 


Lisa Jardine 


(1944-2015) 


Historian of science who chaired pioneering embryology regulator. 


Lisa Anne Jardine rewrote the 
history of European intellectual 
and scientific life in the 
sixteenth and seventeenth centuries. 
She was always most interested in the 
precise ways in which her research 
subjects did their jobs. She showed 
how the sixteenth-century theologian 
Desiderius Erasmus invented a new 
kind of career as a scholar and writer in 
the world of print; how Robert Hooke 
and Christopher Wren transformed the 
city of London's landscape in the seven- 
teenth century; and how the Huyghens 
family and their Dutch compatriots 
created a sparkling world of exotic 
gardens, spectacular works of art and 
penetrating inquiries into nature. 

She told these stories in huge biogra- 
phies and histories that were as acces- 
sible and elegant in style as they were 
novel in content. These included The 
Curious Life of Robert Hooke (Harper- 
Collins, 2003) and Going Dutch (Harper, 
2008). Latterly, she steered the United King- 
dom’s pioneering regulator, the Human 
Fertilisation and Embryology Authority 
(HFEA) safely through choppy waters. 

Jardine, who died of cancer on 25 October 
2015, was professor of Renaissance stud- 
ies at University College London. Born on 
12 April 1944 in Oxford, Jardine was in some 
ways fated to study and write about science 
and the humanities. Her father, mathemati- 
cian and biologist Jacob Bronowski, created 
the landmark 1973 BBC documentary series, 
The Ascent Of Man. Like him, Jardine read 
mathematics at the University of Cambridge, 
later switching to study English. She then did 
a master’s in translation at the University of 
Essex and a doctorate in Renaissance studies 
at Cambridge. 

She became fascinated with what might 
be called the history of knowledge and the 
practices by which humans attain it. She 
admired dominant historical figures of the 
Royal Society in London, such as Hooke and 
Wren, and showed how they found new ways 
to unpick the fabric of nature. In 1996 she 
devoted a pioneering book on the Renais- 
sance, Worldly Goods (Macmillan), to the 
merchants and customers of the fifteenth 
and sixteenth centuries who learned to 
appreciate fine paintings, sumptuous fabrics 
and rare objects. The Renaissance itself, in 
her view, emerged from their finely honed 
consumerism. 


As Jardine’s interests developed, she 
invented new ways of writing history. Scholars 
knew for centuries that the sixteenth-century 
writer Gabriel Harvey adorned his books with 
vast marginal notes. Jardine deciphered them 
and identified Harvey as a figure of a previously
unknown kind, a Renaissance political
adviser who served great men by reading the 
classics with them. Historians of science, in 
the 1980s and after, concentrated on recon- 
structing the precise, local practices of men 
such as Hooke and Robert Boyle. Jardine 
did the same: but she never forgot that they 
were polyglots, in dialogue with others across 
Europe. English science in its early heyday, as 
she portrayed it, was not a creation of national 
genius but a structure raised on foundations 
laid by the Dutch. 

In the 1970s and 1980s, when women 
were still rare in academia, Jardine became 
a mentor and model for a great many 
younger scholars, both male and female. 
Her greatest talent — and greatest love — 
was teaching. She lectured to prodigies 
and ordinary students with equal engage- 
ment, mentored brilliant scholars with 
immense generosity, and always found a 
way to look after one more student than 
the budget allowed. Even more than her 
books, her students are her monument. 

Committed to public service, Jardine held 
many high-profile posts. She judged the Man 
Booker and Whitbread literary prizes, and 
served as a trustee for London’s Victoria 
and Albert Museum and a council
member of the Royal Institution. In her 
two terms as chair of the HFEA, from 
2008 to 2014, she led efforts to reduce 
multiple births resulting from in vitro 
fertilization (IVF), provide fairer compensation
for donors, reduce regulatory
overlap and give better access to data 
for researchers — all this at a time when 
the future of the globally respected 
organization was uncertain. 

She felt particularly honoured to 
have overseen the 2012 public consul- 
tation on mitochondrial replacement. 
This IVF technique aims to prevent 
women from passing on harmful muta- 
tions in the cell’s energy-producing 
structures, mitochondria, by using a 
third party to provide healthy mito- 
chondrial DNA for a future baby. The 
United Kingdom this year became the 
first country in the world to allow this 
technique in the clinic, and the HFEA’s
engagement exercise is frequently cited in 
current debates on genome editing in sperm, 
eggs and embryos. 

What mattered most to Jardine was not the 
institution she served, but the quality of her 
service. Of her many distinctions, election to 
the Royal Society as an honorary fellow par- 
ticularly delighted her. She was just as proud 
of her stint as governor of a London school. 

Her favourite brooch read multum in parvo 
(‘a lot in a little’), her joke about her height.
She was a commanding presence in public, 
a dazzling speaker whose plenary lectures 
were the most memorable events at many 
conferences. A wider British public knew her 
from many years of broadcasts, including the 
2013 BBC radio series Seven Ages of Science, 
in which she vividly conveyed the excitement 
and complexity of historical research. Her 
voice on the page was distinctive: trenchant, 
accurate and unfailingly eloquent, whether 
she was arguing a historical case in a journal 
or engaging in a contemporary debate in a
newspaper or online article. 

Lisa was a rare figure — she combined 
academic brilliance with a deep commit- 
ment to public service, and made it all look 
so easy.


Anthony Grafton is professor of history 
at Princeton University in Princeton, New 
Jersey, USA. He collaborated extensively 
with Lisa Jardine. 

e-mail: grafton@princeton.edu 


LISA JARDINE 


NEWS & VIEWS 




NUCLEAR PHYSICS 


Close encounters of the alpha kind 


Breakthrough calculations of collisions between two helium nuclei pave the way to a quantitative understanding of how 
the elements carbon and oxygen were made in stars — and to improved models of stellar evolution. SEE LETTER P.111 


SOFIA QUAGLIONI 


The life-enabling elements carbon
and oxygen were mainly made in red
giant and supergiant stars, through
a sequence of fusion reactions known as 
helium burning. In this process, helium nuclei
(⁴He), which are aggregates of two protons and two
neutrons named alpha particles by Ernest
Rutherford¹, are progressively converted into
carbon and oxygen nuclei (¹²C and ¹⁶O, respectively).
But exactly how each of these reactions
happens at a fundamental level remains unexplained.
On page 111 of this issue, Elhatisari
et al.² take a crucial step towards addressing
this question by describing the collision of two
alpha particles (alpha–alpha scattering) from
first principles. The theoretical-computational 
methods described in this work could also be 
used to characterize collisions between certain 
other composite quantum particles. 

Helium burning starts when two alpha 
particles collide with each other and attempt 
to fuse. But the product of that fusion is an 
unstable beryllium isotope, ⁸Be, which decays
almost instantly back into two helium nuclei.
A third alpha particle therefore has to be captured
nearly simultaneously with the collision
of the original pair for ¹²C to be formed. This
process is known as the triple-alpha reaction³
and was first proposed in 1952. Oxygen is
then created when ¹²C captures a fourth alpha
particle⁴ (Fig. 1). The ratio of the amount of
carbon to that of oxygen produced through
these processes has profound repercussions
on the later evolutionary phases of magnificent
massive stars such as Orion’s Betelgeuse,
and their ultimate fate once they explode as
supernovae⁵.

The odds of any of these reactions taking 
place are small at the low energies encountered 
in stellar environments, largely because the 
positively charged colliding nuclei electrically 
repel each other. The low reaction rates make 
helium-burning reactions difficult to replicate 
and measure in a laboratory. This prevents the 
carbon-to-oxygen ratio produced in stars from 
being accurately estimated, and introduces 
large uncertainties in stellar-evolution mod- 
els and simulations of the processes that create 
nuclei (nucleosynthetic processes). A computational
approach capable of describing helium
burning from first principles could come to
the rescue by providing ‘measurements’ from
simulations. Of course, before such theoretical
predictions can be trusted, their accuracy
in reproducing experimentally determined
data must be ascertained. The scattering of
two alpha particles has been characterized
experimentally, and is a good starting point
in this case.

Figure 1 | Formation of carbon and oxygen nuclei. Carbon nuclei, ¹²C, form in stars through a sequence
called the triple-alpha process. Two alpha particles (which consist of two protons and two neutrons)
collide to produce a transient beryllium nucleus, ⁸Be, and a gamma ray. If another alpha particle then
collides with the ⁸Be nucleus before the nucleus falls apart, a ¹²C nucleus and another gamma ray can be
generated. The ¹²C nucleus can then collide with a fourth alpha particle to form an oxygen nucleus, ¹⁶O.
Elhatisari et al.² report first-principles simulations of collisions between two alpha particles, a first step
towards numerical modelling of ¹²C and ¹⁶O formation.

Why did we have to wait until 2015 to obtain 
a first-principles description of alpha–alpha
scattering? The simple answer is that develop- 
ing a fundamental understanding of nuclei and 
their interactions is one of the most compli- 
cated problems in science. It involves unravel- 
ling the properties of an ensemble of nucleons 
(protons or neutrons), exerting forces on each 
other, that emerge from the underlying theory 
of strong interactions, while also accounting 
for all the quantum-mechanical laws that gov- 
ern microscopic objects. The complexity of the 
numerical calculations needed explodes as the 
number of nucleons increases. Explaining the 
dynamic interactions of clusters of nucleons 
is especially hard. Even the scattering of deuterium
(a two-nucleon system) from alpha
particles has only recently been described from
first principles⁶.

Elhatisari and co-authors report a clever way 
to break through this computational ceiling. 
They start with a supercomputer-friendly
formulation of chiral effective field theory,
set on a four-dimensional space-time lattice.
This theory⁷ links the interactions of nucleons
to quantum chromodynamics (the underlying 
theory of the strong force) by describing them 
as the sum of an infinite number of terms, 
systematically organized in order of impor- 
tance so that all but the first few terms can be 
neglected. 

The authors use their formulation to follow 
the evolution of the wavefunction of mutually 
interacting nucleons in a pair of alpha par- 
ticles, although this is easier said than done. 
It requires a trick⁸,⁹ developed in the 1950s
and an aptitude for ‘gambling’: the problem 
is reformulated as a much simpler system 
of independent nucleons interacting with a 
background of auxiliary particles; the most- 
likely backgrounds are drawn from a prob- 
ability distribution. This technique, known 
as auxiliary-field Monte Carlo, is then used
to ‘cool’ the two alpha particles, which are
initially placed at a distance from each other, 
to their correct physical state — that is, to a 
low-energy quantum-mechanical solution of 
two dynamically interacting nucleon clusters. 
This process is repeated over and over, each 
time with a new relative position for the ini- 
tial pair of alpha particles, again drawn from 
an appropriate probability distribution, and 
the resulting physical states are then used to 
compute the effective interaction experienced 
by the two particles as a whole. Voila! The 
ferocious eight-body problem is transformed 
into a docile two-cluster problem. 
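
The idea of trading an interaction for independent particles in a randomly sampled background can be illustrated with a toy identity. The sketch below assumes that the 'trick' is the Hubbard-Stratonovich transformation (the text does not name it) and checks numerically that exp(x²/2) equals the average of exp(s·x) over a Gaussian auxiliary field s. The function name, the sample count and the seed are illustrative choices, and this is in no way the authors' lattice code.

```python
import math
import random


def auxiliary_field_average(x, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[exp(s * x)] for s drawn from a standard normal.

    By the Gaussian identity this approaches exp(x**2 / 2): a term quadratic
    in x (standing in for an interaction) is replaced by a term linear in x
    that couples to a randomly sampled auxiliary field s.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(n_samples):
        s = rng.gauss(0.0, 1.0)  # one auxiliary-field configuration
        total += math.exp(s * x)
    return total / n_samples


x = 0.5
estimate = auxiliary_field_average(x)
exact = math.exp(x ** 2 / 2)
print(f"sampled: {estimate:.4f}  exact: {exact:.4f}")
```

In a real calculation the single number x is replaced by operators acting on many nucleons and the field lives on every lattice site, but the statistical idea of sampling backgrounds is the same.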

After 2 million hours of parallel computa- 
tions, the alpha-alpha scattering properties 
obtained by the authors — including the first 
three interaction terms of the sequence pro- 
vided by chiral effective field theory — show 
promising agreement with experimentally 
obtained values. The calculation is admit- 
tedly somewhat shy of the accuracy required 
to make quantitative predictions for nucleo- 
synthesis and stellar evolution. Improvements 
should be made by including the next term in 
the sequence, investigating the dependence 
of the results on the spacing of the space- 
time lattice, and doing other precision tests. 
Extensions to enable the treatment of three-
cluster dynamics¹⁰ are also required before
the method can be applied to the triple-alpha 
process. 

Most impressively, Elhatisari et al. have 
devised a first-principles method for simulat- 
ing scattering and reactions in which the num- 
ber of computing operations is proportional 
to the square of the number of nucleons, and 
therefore grows relatively slowly. Scattering 
between alpha particles and ¹²C, and the conversion
of these particles to ¹⁶O (a problem
that is only four times as difficult as alpha–alpha
scattering using the authors’ approach),
are now within reach of simulations. Furthermore,
analogous methods could be used
to solve other puzzles. For example, predictive 
calculations of hyperon-neutron scattering 
could help to settle whether or not ‘strange’ 
particles can exist in the cores of neutron 
stars¹¹, thus providing insight into the phases
of dense nuclear matter that can exist. But that
is another story.
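
The 'four times as difficult' figure follows directly from the quadratic scaling. As a small worked check, assuming only that cost grows as the square of the total number of nucleons (the function and constant names are illustrative):

```python
def relative_cost(n_nucleons, n_reference, exponent=2):
    """Cost relative to a reference system, assuming the number of
    operations grows as (number of nucleons) ** exponent."""
    return (n_nucleons / n_reference) ** exponent


ALPHA = 4       # nucleons in a helium-4 nucleus
CARBON_12 = 12  # nucleons in a carbon-12 nucleus

alpha_alpha = ALPHA + ALPHA        # 8 nucleons in alpha-alpha scattering
alpha_carbon = ALPHA + CARBON_12   # 16 nucleons in alpha + carbon-12

# Doubling the nucleon number multiplies the cost by 2 ** 2.
print(relative_cost(alpha_carbon, alpha_alpha))
```

This is only a scaling estimate; actual run times would also depend on lattice size, precision targets and other details not covered here.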


Sofia Quaglioni is in the Nuclear and 
Chemical Sciences Division, Physical and Life 
Sciences Directorate, Lawrence Livermore 
National Laboratory, Livermore,
California 94551-0808, USA.

e-mail: quaglioni1@llnl.gov 


Ecosystem vulnerability
to ocean warming 


Analysis of the temperature ranges occupied by marine species finds that the 
vulnerability of ecological communities to global warming may depend more on 
organismal physiology than on the magnitude of change. SEE ARTICLE P.88 


DEREK P. TITTENSOR 


Human communities have already
begun to develop and execute plans
for adapting to climate change¹.
Ecological communities are equally vulner- 
able, and human intervention is required to 
alleviate pressures and minimize the risks of 
biodiversity loss and species extinction. Much 
of our attention so far has focused on predict- 
ing the responses of individual species. But is 
it possible to anticipate how entire ecosystems 
will respond and reconfigure as the land and 
oceans warm? In this issue, Stuart-Smith 
et al.² (page 88) construct a metric of community
vulnerability for marine environments
that is based on the physiology of individual 
species as well as on external environmental 
conditions. Their findings challenge previous
assumptions that the magnitude or rate
of warming is the best predictor of ecological 
change. 

All species have a thermal niche — the 
temperature range in which they can survive. 
But, in reality, most do not occupy all sites 
within this range because other constraints, 
such as competition and food availability, limit 
them further. The range of temperatures over 
which a species actually lives is its ‘realized’ 
thermal niche. Stuart-Smith et al. use two large 
species-occurrence databases to construct 
realized thermal niches for almost 4,000 reef 
fish and marine macroinvertebrate species, by 
comparing observations of the animals’ occur- 
rence with data on sea surface temperature at 
those locations. 

Each realized thermal niche has a midpoint, 
and the mean of these midpoints for all indi- 
viduals in an ecological community is called 
the community thermal index (CTI; Fig. 1). 

Figure 1 | Community thermal indices. Each species has a realized thermal niche, the range of
temperatures over which it can live in a given community. The community thermal index (CTI) is the
mean midpoint of these thermal niches for all individuals in a community. Stuart-Smith et al.² have
extended this metric to the CTImax, which is a measure of the mean of the upper end (95th percentile)
of the realized thermal limits of the species in the community. The CTImax allows calculation of a
vulnerability metric: the proportion of species in a community that has an upper thermal limit
lower than a given summer sea surface temperature. In this example, if the future summer sea surface
temperature is 25 °C, the vulnerability metric will be 0.33, because one of the three species in the
community (species 3) has an upper thermal limit below that temperature.

Although the CTI is not a new concept (see
ref. 3, for example), Stuart-Smith et al. calcu- 
late it across a global range of marine commu- 
nities. They then compare it with observed 
temperatures to determine a ‘thermal bias’,
the discrepancy between the CTI and the mean
annual sea surface temperature. This indicates 
whether a community is weighted towards 
species adapted to warmer environments 
(a positive bias) or to cooler ones. 
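
The two quantities defined so far can be sketched in a few lines. The example below assumes that the CTI is an abundance-weighted mean of species' niche midpoints (one reading of 'for all individuals') and that thermal bias is CTI minus mean annual sea surface temperature, so that a warm-adapted community gets a positive value. The species names, abundances and temperatures are invented for illustration.

```python
def community_thermal_index(abundance, midpoint):
    """Mean thermal-niche midpoint (deg C) over all individuals in a community.

    abundance: species -> number of individuals recorded at the site
    midpoint:  species -> midpoint of that species' realized thermal niche
    """
    total = sum(abundance.values())
    return sum(n * midpoint[sp] for sp, n in abundance.items()) / total


def thermal_bias(cti, mean_annual_sst):
    """Positive when the community is weighted towards warm-adapted species."""
    return cti - mean_annual_sst


# Hypothetical community of three species.
midpoint = {"species_1": 24.0, "species_2": 22.0, "species_3": 18.0}
abundance = {"species_1": 10, "species_2": 30, "species_3": 60}

cti = community_thermal_index(abundance, midpoint)
bias = thermal_bias(cti, mean_annual_sst=21.0)
print(f"CTI = {cti:.1f} deg C, thermal bias = {bias:+.1f} deg C")
```

Here the abundant cool-water species pulls the CTI below the local temperature, giving a negative bias.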

The authors find that most communities 
are associated with a thermal bias. This is per- 
haps unsurprising, but Stuart-Smith et al. also 
detect an intriguing large-scale biogeographical 
pattern, with the thermal bias not randomly 
scattered around ocean temperature (see Fig. 1 
of the paper²). Moving from CTIs to examining
the thermal-niche midpoints that underpin
them, the authors find that most species are
associated with either a temperate or tropical 
midpoint, with noticeably fewer species having 
midpoints at subtropical temperatures (there is 
also a third group of invertebrates that have a 
subpolar midpoint). Such faunal clustering is 
responsible for the nonlinear distribution in 
thermal bias observed in these marine commu- 
nities. This pattern awaits testing for consistency 
in other taxa, and the potential mechanisms that 
establish it require further exploration. The 
assumption that a species is ‘optimized’ for its 
thermal midpoint also needs further experi- 
mental or empirical verification, although the 
authors provide evidence to support it. 

Of course, a thermal bias in a community 
may not represent anything more than an 
assortment of species with wide thermal 
ranges, indicating low susceptibility to warm- 
ing. To move beyond an index of bias and 
towards an estimate of vulnerability, a meas- 
ure that accounts for the upper limit of each 
organism's temperature tolerance must instead 
be determined. Stuart-Smith et al. define the
upper limit as the 95th percentile of a species’ thermal
distribution; thus, for a species with a 95th percentile
of 28 °C, 95% of individuals would be
found below this temperature. The authors then
define a measure for each community, called
the CTImax, as the mean of the 95th percentiles of
the species in the community (Fig. 1).

Sites with a CTImax close to the summer
water temperature are likely to contain many 
species living perilously close to their ther- 
mal limits. The authors encapsulate this in a 
vulnerability metric, which is defined as the 
proportion of species at each site that have an 
upper thermal limit lower than the mean sum- 
mer water temperature. Projecting tempera- 
tures forward 100 years to 2115 using climate 
models from the Fifth Assessment Report 
of the Intergovernmental Panel on Climate 
Change (IPCC)⁴, they predict that one-third
of surveyed ecoregions will have all species liv- 
ing at temperatures greater than their upper 
thermal limits — a stark indication that these 
individuals must move, adapt or perish. 
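
The CTImax and the vulnerability metric can be sketched in the same spirit. The code below assumes a nearest-rank 95th percentile for each species' upper limit and an unweighted mean across species for the CTImax; the study's exact estimators may differ. The three upper limits are invented so that the result matches the worked example in the Figure 1 caption (one of three species below a 25 °C summer).

```python
def upper_thermal_limit(temperatures_c):
    """Nearest-rank 95th percentile of the temperatures (deg C) at which
    a species was recorded."""
    ordered = sorted(temperatures_c)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integers
    return ordered[rank - 1]


def cti_max(upper_limits):
    """Mean of the species' upper thermal limits in a community."""
    return sum(upper_limits.values()) / len(upper_limits)


def vulnerability(upper_limits, summer_sst):
    """Proportion of species whose upper limit lies below the summer
    sea surface temperature."""
    exceeded = [sp for sp, limit in upper_limits.items() if limit < summer_sst]
    return len(exceeded) / len(upper_limits)


# Hypothetical upper limits (deg C) for a three-species community.
upper_limits = {"species_1": 29.0, "species_2": 27.0, "species_3": 23.0}

print(f"CTImax = {cti_max(upper_limits):.2f} deg C")
print(f"vulnerability at 25 deg C = {vulnerability(upper_limits, 25.0):.2f}")
```

Raising the summer temperature above every species' limit drives the metric to 1, which is the situation projected for one-third of the surveyed ecoregions.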

Will all individuals of those species in
these ecoregions die? Not necessarily. Just as
correlation does not imply causation, vul- 
nerability does not imply extinction. Many 
species may have greater plasticity or ability to 
respond to change than we anticipate. Stuart- 
Smith and colleagues’ study also does not take 
into account the complex interactions between 
species, perturbations of which may propagate 
in unpredictable and complex ways. Other 
potential biasing factors include the authors’ 
use of the IPCC’s most-extreme climate scen- 
ario (RCP8.5), and the fact that some other 
anthropogenic impacts will act synergistically. 
And as communities reorganize, species may 
move in as well as out and total species richness 
thus be unaffected — or even increased. 

Nonetheless, the CTImax takes into account
organismal physiology rather than just levels
or rates of environmental warming, and as such
may move us a step closer towards understanding
the effects of warming on entire assemblages.
Indeed, Stuart-Smith et al. find that
the sites projected to lose the most species are 
those with a more negative thermal bias, rather 
than those with a high magnitude of warming. 
This suggests that using environmentally based 
metrics of warming (see ref. 5, for example), 
without taking species characteristics into 
account, may be insufficient for character- 
izing vulnerability. An obvious next step is to 
test Stuart-Smith and colleagues’ approach 
with other taxa to see whether similar patterns 
emerge. However, under-sampling of species 
ranges could give the impression of an artifi- 
cially narrow niche. The picture provided is 
only as good as the data underlying it, which 
may limit broader application of this metric. 

The response of ecological communities
to climate change is undeniably more complex
than any single value can reveal. Suites of
metrics are used by international policymakers
to track the status of biological communities
in response to anthropogenic change⁶, and
metrics that encompass biological traits may
add valuable information. In conjunction with
modelling efforts that incorporate species
interactions (see, for example, refs 7 and 8), a
scaffolding of understanding, or at least plausibility,
can be constructed. Stuart-Smith et al.
have contributed a tool that could help us to
reach this goal. But in a world in which marine
and terrestrial ecosystems face accelerating
pressures⁹, our ability to respond, protect and
sustain remains precarious.


METABOLISM


Inflammation keeps
old mice healthy


Immune cells called regulatory T cells accumulate in fat during ageing. The 
anti-inflammatory activity of these cells worsens age-associated defects in 
metabolism, in contrast to its effect in obesity. SEE LETTER P.137 


IVAN MAILLARD & ALAN R. SALTIEL 


Body fat undergoes extensive and frequent
remodelling, as changes in blood-vessel
development, connective tissue, the
number and size of fat cells, and other features 
allow fat to store or release the energy that 
the organism needs. But this adaptive pro- 
cess can become harmful in stressful con- 
ditions. Maladaptive remodelling in obese 
rodents and humans is associated with larger 
fat cells and chronic inflammation in fat,
leading to insulin resistance, type 2 diabetes
and cardiovascular complications¹. Attempts to
disentangle the various components that cause
this dysregulation of metabolism have revealed
a coordinated inflammatory circuit involving
interactions between several cell types¹.
On page 137 of this issue, Bapat et al.² add a
new twist to the story, providing evidence
that, contrary to what might be expected,
age-associated metabolic changes may be
regulated by a different mechanism from those
associated with obesity.

Specialized immune cells called regulatory
T cells (Treg cells) suppress the inflammatory
immune responses driven by white blood cells.
The importance of Treg cells for the functioning
of the immune system is highlighted by the
fact that mammals that lack the transcription
factor FOXP3, which controls the development,
maintenance and function of Treg cells,
develop multi-organ autoimmune disease.
Although many different immune cells are
present in fat, fat-resident Treg cells (fTreg cells)
have attracted attention as potential modulators
of local inflammation.

The Treg cells first enter fat immediately after
birth, accumulate during ageing and acquire
a molecular signature characterized by the
expression of several factors, including the
transcription factor PPAR-γ, which controls
differentiation, and the protein subunit ST2
(a receptor for the protein IL-33), which
has been implicated in the development of
fTreg cells. Furthermore, some T-cell-receptor
proteins (which recognize structures called
antigens during an immune response) are
preferentially expressed in fTreg cells over other
Treg cells, suggesting that certain fTreg populations
proliferate in response to antigens.
Although much remains to be learnt, data from
animal models of obesity suggest that fTreg cells
protect against inflammation and metabolic
dysfunction.

Bapat et al. analysed fTreg cells in ageing
mice, rather than in models of obesity, and
turned the tables on previous assumptions
about these cells. Understanding ageing-associated
metabolic dysregulation is vital,
because age is a major risk factor for insulin
resistance and diabetes. The authors observed a
striking age-related accumulation of fTreg cells.
But unexpectedly, when the authors depleted
the fTreg population by deleting PPAR-γ in
these cells, they found that the ageing mutant
mice gained less weight than their wild-type
counterparts: they accumulated less body fat
and more lean weight, ate more and burnt more
calories than age-matched controls.

Every metabolic parameter analysed by 
the authors was better in these mice. Fast- 
ing glucose and insulin levels decreased, as 
did insulin resistance. By contrast, pharmacological
expansion of the fTreg population
increased levels of insulin resistance and other 
parameters of metabolic dysregulation. 

The researchers found that fTreg depletion
was associated with increased local levels of the
pro-inflammatory signalling molecule TNF-α,
consistent with increased fat inflammation.
Depletion was also correlated with a reduction
in fat-cell size and with reduced expression of
collagen genes, evidence of improved metabolic
activity and beneficial fat remodelling.
Ageing fTreg cells maintained their molecular
signature and their ability to suppress immune
responses, indicating that their function had
not been subverted. Together, these data demonstrate
that fTreg accumulation plays a part in
age-associated metabolic dysregulation, and
that at least some aspects of the inflammatory
response suppressed by fTreg cells are favourable
in ageing (Fig. 1).

Figure 1 | Regulatory T cells impair metabolic regulation in old fat. a, During ageing, immune cells
called fat-resident regulatory T (fTreg) cells accumulate in body fat and decrease local inflammation (red
colouring). This age-associated accumulation correlates with increases in metabolic dysregulation and
in the size of the fat cells, and with a decrease in the ability of fat to remodel, the process by which fat
undergoes morphological alterations in response to changing nutrient demands. b, Bapat et al.² depleted
the fTreg population in ageing mice. This increased inflammation, reduced the size of fat cells
and improved fat remodelling, thereby improving metabolic regulation.

These surprising results contrast with
previous observations, which indicated that
fTreg numbers decrease in obese mice, and
that fTreg expansion improves metabolic
health without affecting weight. Although
Bapat et al. focused on ageing, some of their
results are notably divergent from previous
data. For instance, the authors found no evidence
that fTreg cells modulate metabolism
in a high-fat-diet model of obesity (although
there were few fTreg cells in the fat, and the
consequences of expanding the fTreg population
were not tested in this model). Directly
contradicting previous data, they found that
stimulating PPAR-γ activity with the anti-diabetic
drug rosiglitazone exerted beneficial
metabolic effects in obese mice, even those in
which PPAR-γ was deleted in fTreg cells, suggesting
that other cell types are crucial targets
for this drug.

The reasons for these discrepancies remain 
to be investigated. Perhaps they represent 
differences in experimental design, or in the 
populations of commensal bacteria found in 
mice used at different institutions. It will also 
be essential to evaluate whether the effects of 
fTreg cells observed in mice apply to humans.
A complicating factor is that ageing- and 
obesity-associated metabolic dysregulation 
often coexist in humans. 

Nonetheless, information is accumulating
about fTreg cells and inflammation. First, other
laboratories have also detected age-dependent
Treg accumulation in fat, although the metabolic
impact of fTreg cells in ageing has not
previously been measured. Second, a refined
understanding of the cells’ gene-expression
profile is emerging, providing hints about their
molecular regulation and tools that could be
used to manipulate their numbers and function.
Third, compelling evidence supports
the existence of antigens in fat that drive
expansion of fTreg populations; these are probably
presented to fTreg cells in association with
the antigen-presenting protein complex, major
histocompatibility complex class II. It will be
crucial to identify these antigens and other
factors that contribute to fTreg accumulation
during ageing.

Is there crosstalk between fTreg cells and
other immune cells, such as innate lymphoid
cells (ILCs)? Recent reports¹⁰,¹¹ have shown
that ILCs infiltrate fat, and have indicated that
IL-33 activates ILC2s to induce white fat to
become heat-producing beige fat. It is possible
that IL-33 has two opposing roles: inducing
energy expenditure through ILC2s, but ensuring
energy conservation through expansion of
the fTreg population and by inducing the secretion
of molecules such as IL-10 that promote
anabolic activity (which increases energy storage
in fat)¹². Thus, fTreg depletion could tip the
balance in favour of energy expenditure, causing
weight loss and improved metabolic health.
However, so far there have been no studies on
IL-33 or ILC2s in ageing mice.

Finally, blocking inflammatory pathways 
in ageing fat cells has been reported to impair 
metabolic regulation¹³, in contrast to the prevailing
view but consistent with Bapat and colleagues’
data. Identifying beneficial elements of
the inflammatory response, and investigating 
their metabolic effects, will be essential, and 
may point to evolutionarily conserved features 
of inflammation that lead to tissue adaptation 
under abnormal conditions. In fat and else- 
where, different facets of inflammation might 
have many roles, both good and bad.


A century of
phage lessons 


One hundred years after the first description of viruses that infect bacterial cells, 
the contribution of these bacteriophages to fundamental biology, biotechnology 
and human health continues unabated and deserves celebration. 


FOREST ROHWER & ANCA M. SEGALL 


In 1915, bacteriologist Frederick Twort¹
published the first report of viruses that
infect bacteria, replicate there and kill
the cells. Since then, studies of these viruses,
known as bacteriophages, or more colloquially
as phages, have transformed biology. Phages
provided the experimental systems and tools
for the molecular-biology revolution of the
twentieth century, and their rapid growth rates
have allowed fundamental principles of ecology
and evolution to be tested. We now know
that phages are the world’s most successful
biological entities, being more abundant and
genetically diverse than any other life form.
Despite their importance, the study of these
fascinating entities remains a niche endeavour.
Here, we briefly review the history of phage
studies, with the hope of inspiring a new
generation of phage scientists.

In the early 1900s, most phage scientists
were interested in using the viruses as antibacterial
agents. This was an era of uncontrolled
scientific trials, in which people were
injected with phages or the viruses were
poured into water wells with the hope of killing
pathogenic bacteria, such as those that cause
cholera. This line of research dramatically
decreased with Alexander Fleming’s discovery
of antibiotics in 1928. But the concept of
‘phage therapy’ is currently resurging as antibiotic
resistance becomes more of a concern.

Phage science entered the quantitative
realm when a network of biologists, biochemists
and physicists, known as the Phage
Group, used these viruses as models for their
pioneering studies of how life works. In 1952,
Alfred Hershey and Martha Chase² performed
a famous experiment in which radiolabelled
phages were sheared off bacterial cells using
a high-speed blender, helping the research- 
ers to establish that DNA is the genetic 
material. The discovery of phage-encoded, 
DNA-manipulating enzymes — such as DNA 
and RNA polymerases, ligases and endo- and
exonucleases — literally catalysed the rise 
of molecular biology and the biotechnol- 
ogy industry, and phage proteins are now 
used every day all over the world. Restriction 
enzymes that protect bacteria from phage 
infection provided another indispensable tool 
for molecular biologists. This trend continues 
today, as can be seen from the revolution in 
genome editing that has arisen following the 
discovery of the CRISPR-Cas system, used by 
bacteria as a defence against phages. 

As the genetic code was revealed in the 
mid-1900s, the sequencing of a complete 
genome became a major research goal. Phages 
were attractive targets because of their small 
genome size and the possibility of mak- 
ing large amounts of DNA for sequencing. 
Frederick Sanger and colleagues’ sequenced
the complete genome of the phage ΦX174
in 1977, decades before any cellular genome
was completed. As additional phage genomes
accumulated, it became apparent that phages
exchange genes and large sections of DNA
between individuals’. This discovery of horizontal
gene transfer changed our understanding
of how genetic variability is produced.
Marine phage communities were the first to
be ‘shotgun sequenced’, leading to the rise of
metagenomics, the sequencing en masse of
all members of a community’.

Figure 1 | Bacteriophage in action. Bacterium-infecting viruses were first described’ in 1915,
but it was only in 1940 that the first electron micrographs (a) of bacteriophages (arrow) infecting
bacteria were published’. These early images helped to confirm that the effects attributed to phages
were indeed caused by viruses, and not by enzymatic activity. Modern electron microscopy (b, c)
has produced images of phages that reveal details of phage structure and infection processes (see, for
example, ref. 16).

Understanding phages has contributed to 
our fundamental understanding of host cells 
and disease (Fig. 1). When phages integrate 
into bacterial genomes, they can dramatically 
change the characteristics of their bacterial 
hosts — many of the most deadly bacterial 
pathogens, including Vibrio cholerae and 
Shigella and Salmonella species, acquire 
virulence factors through this mechanism. 
Dissecting the biology of phage replication also 
uncovered several key host-encoded factors 
that are needed for the phage life cycle, such as
the enzyme DNA gyrase® and the ‘chaperone’ 
protein complexes GroEL and GroES’. 

When the ‘war on cancer’ was declared by 
then US President Richard Nixon in 1971, 
phage biologists were actively recruited into 
research on human biology. Building on the 
knowledge that phages encode some proteins 
that are similar to those of the host, these 
scientists looked in the human genome for 
analogous genes from other viruses. Not only 
did they find such genes, but they also devel- 
oped the idea of ‘proto-oncogenes’ present 
in our genome that, when mutated, are key 
drivers of cancer. 

Other phage researchers moved into the 
fields of DNA mutagenesis, repair and recom- 
bination, providing the basis for our under- 
standing of cancer today. For example, the 
understanding® that pre-existing mutations 
can give individual cells growth advantages 
under different environmental conditions led 
to the idea that cancer cells harbour dozens of 
pre-existing mutations that may or may not be 
related to the actual tumour’. With the advent 
of the AIDS epidemic, phage researchers 
opened the door to our understanding of how 
retroviruses integrate into the human genome, 
and what host proteins are involved”. 

The downside of phage scientists moving 
into different arenas was a massive decline in 
phage research from the 1970s onwards. Given 
that phages are such great anvils for the ham- 
mers of biologists, why do many researchers 
pay them so little attention? One reason might 
be that, as frequently occurs in any old disci- 
pline, the literature is dense and filled with 
acronyms and a changing nomenclature. To 
help counter this, we provide some guiding 
principles on phages. 

A first key point is the contribution of phages 
to biological diversity. There are probably more 
than 10³¹ phage particles on the planet, with
approximately 10 phage particles for every
bacterial cell''. In humans, the main genetic 
difference between two individuals is the 
phages in their gut’”. Among other roles, these 
viruses form an adaptable immune system 
that makes use of hypervariable, immuno- 
globulin-like protein domains similar to those 
used by antibodies”. 

The second concept is that phages carry 
genes encoding proteins that modulate the 
fundamental physiology of the host, such as 
metabolism and antibiotic resistance. One 
fascinating example occurs in photosynthesis 
by oceanic cyanobacteria’. The components 
of the light-gathering antenna complexes 
produced by these bacteria are highly labile 
and decay during phage infection. But the 
phages can carry genes that encode replacement
of the damaged proteins, allowing the bacteria
to continue to produce biomass and the phages
to produce larger bursts of progeny. Thus,
these marine phages contribute to the vast
turnover of carbon in the oceans by increasing
the efficiency and output of photosynthetic processes.

A third phage lesson is that the niche space 
of any bacterial cell is determined by its phages. 
The main genomic differences between closely 
related bacteria derive from integrated phages 
(prophages) and genomic features, ranging 
from indels to major rearrangements, that help 
to guard against phage infection. This never- 
ending selective pressure exerted on bacteria 
by their phages is the best-characterized exam- 
ple of the Red Queen hypothesis — that preda- 
tor and prey species must constantly evolve. 
What will the phage future look like? These 
viruses are relatively easy to synthesize, and 
their genomes have modular characteristics 
that appeal to synthetic biologists for engineer- 
ing biological functions. One hundred years 
after their discovery, we think that it is time 
for our fellow biologists to throw off their cell-centric
habits and embrace the phage.
50 Years Ago


The Architecture of Molecules. By
Prof. Linus Pauling and Roger
Hayward – “We are now living
in an atomic age. In order to
understand the world, every person
needs to have some knowledge of
atoms and molecules.” This is the
beginning paragraph of a fascinating
work of art ... The question as to
how some understanding of science,
however superficial, can be brought
to the man and woman in the street
has exercised many organizations
as well as individuals. At a practical
level, of course, it is unnecessary
to know anything about electric
currents in order to turn a switch
and bring on the light. Babies love
to do it before they are one year
old. But for all too many people
science is still magic even when they
are twenty-one ... Linus Pauling
worries. He thinks, quite rightly, that
young people ought to want to know
why the ‘lead’ of a pencil comes off
on to the paper, what an atom of
hydrogen or uranium ‘looks like’ ...
Undoubtedly many an arts sixth-
former will pick the book off the
school library shelf and will learn a
great deal by browsing through it.
From Nature 4 December 1965 


100 Years Ago 


The nation’s attitude towards science
is, I think, largely due to the popular
idea that science is a kind of hobby
followed by a certain class of people,
instead of the materialisation of
the desire experienced in various
degrees by every thinking person to
learn something about innumerable
natural phenomena still unsolved;
and, having learned, to control
and apply them intelligently for
the benefit of the human race ...
It is to the new generation now
being educated that we must look
for betterment of our position ...
We must make all education more
scientific.

From Nature 2 December 1915 
Getting the measure
of entanglement 


A property called entanglement entropy helps to describe the quantum states of 
interacting particles, and it has at last been measured. The findings open the door 
to a deeper understanding of quantum systems. SEE ARTICLE P.77 


STEVEN ROLSTON 


A puzzling aspect of quantum mechanics
is entanglement: the idea that the
combined state of two particles can
be completely specified, but that the state of
each entangled particle is completely random 
when measured alone. Entanglement entropy 
marries the concept of entanglement with 
that of entropy — the degree of randomness 
of a system — and has become a useful theo- 
retical tool with which to characterize many- 
body states in condensed-matter physics. On 
page 77 of this issue, Islam et al.¹ report the first
experimental measurement of entanglement 
entropy in a small system of atoms trapped in 
a lattice of light, a model of a solid-state system.

Quantum mechanics, the theory of the 
microscopic world, has many features that run 
counter to our everyday experiences in a clas- 
sical world. The possibility of entanglement 
in a quantum system of two or more particles 
has been a challenging and stimulating idea for 
many years. Einstein and his colleagues were
famously bothered by the idea that measuring
one particle of an entangled pair seemingly
instantaneously determined the state of its
partner: ‘spooky action at a distance’, as they
put it’. But the existence of entanglement was 
made concrete through the theoretical work 
of the physicist John Bell’, and experimental 
tests of Bell’s inequalities (constraints derived 
from Bell’s work) have unambiguously veri- 
fied the quantum-mechanical description of 
the microscopic world (see ref. 4, for example). 

Although an understanding of two-particle 
entanglement is quite well in hand, there is no 
specific measure of the amount of entangle- 
ment in three or more particles. Yet entan- 
glement has become an important tool for 
understanding the states of many-body sys- 
tems. When many particles interact with one 
another, even through simple interactions, the 
low-energy quantum states can be surpris- 
ingly complicated, with lots of entanglement. 
Entanglement entropy has become a favoured 
theoretical measure for categorizing such 
complex states. 


Figure 1 | Probing entropy entanglement in optical lattices. Islam et al.¹ report the first experimental
measurements of entanglement entropy, a quantity used in theoretical studies to characterize many-body 
states. a, The authors set up two identical systems of four entangled atoms (dots; double-headed arrows 
indicate horizontal tunnelling), trapped in the potential-energy wells of an optical lattice (an array of 
interfering laser beams; red lines indicate the optical confining fields). b, The confining fields were then 
adjusted to allow the two four-atom systems to tunnel vertically into one another. The resulting number 
of atoms in each lattice site contained a signature of each system’s state, from which the entanglement
entropy can be extracted.
To understand what many-particle entanglement
means, let’s start by considering a
non-entangled system. If I create a system 
that has N particles, each in an identical state 
independent of their N − 1 neighbours, then its
many-body description is simple, and meas- 
uring one particle or partitioning the sample 
has little impact on the overall system. Not that 
such states are uninteresting — this is a good 
description of a state of matter called a
Bose–Einstein condensate, for example. Similarly, if
each particle is in its own different state, with 
no relationship to its neighbours, then meas- 
urement or partitioning has no global effect. 

But if the particles are entangled with one 
another, either pairwise or in a more complex 
fashion, then measurement of one particle 
affects the state of other particles. Entangle- 
ment entropy measures the increase in 
entropy (which can be thought of as increased 
randomness) that occurs if we partition such 
a system’. Identifying emergent, complex, 
lowest-energy states of seemingly simple sys- 
tems of interacting particles is a particularly 
challenging task, for which entanglement 
entropy can be used to understand the nature 
of the state and to probe its ‘quantumness’.

Until now, entanglement entropy has been 
a purely theoretical construct in condensed- 
matter physics, because it is difficult to par- 
tition a solid-state system and measure its 
constituents. Islam et al. have performed the 
first such measurements using two identical 
copies of a small system of four atoms trapped 
in an optical lattice (an array of interfering laser 
beams). If the potential-energy ‘landscape’ of 
the optical lattice is not too deep, the particles 
can tunnel from one site to the next and feel 
the presence of their neighbours. This leads 
to a many-body state that exhibits entangle- 
ment. But if the lattice is deep, the particles act 
as individuals, and are free of entanglement. 

The authors performed their experiment 
in a quantum gas microscope’, in which a 
single layer of an optical lattice is generated 
just below a high-resolution optical micro- 
scope. When Islam et al. relaxed some of 
the optical confining fields, the two copies 
of the four-atom systems could tunnel into 
one another and, through quantum interference
(the Hong–Ou–Mandel effect’), leave
a signature of their state in the number of 
atoms in each lattice site (Fig. 1). The authors 
simply counted the atoms using the micro- 
scope and extracted the entanglement 
entropy (the second-order Rényi entangle- 
ment entropy’, for those in the know) from 
the number of atoms. In this way, they show 
that their four-atom system can have less 
entropy as a whole than when it is partitioned,
something that is not possible without
entanglement, nor in any classical system. 
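
The statement that an entangled system can have less entropy as a whole than in its parts can be checked numerically on the smallest possible case. The sketch below is only an illustration, not the authors' protocol (which obtains the purity from the interference of two physical copies): it assumes a toy system of two qubits rather than four atoms, and the function names are hypothetical. It computes the second-order Rényi entropy, S2 = −ln Tr(ρ²), for the whole state and for one half.

```python
import math


def reduced_density_matrix(psi):
    """Trace out the second qubit of a two-qubit pure state.

    psi holds four amplitudes ordered |00>, |01>, |10>, |11>.
    Returns the 2x2 density matrix of the first qubit as nested lists.
    """
    rho = [[0j, 0j], [0j, 0j]]
    for a in range(2):
        for a2 in range(2):
            for b in range(2):
                rho[a][a2] += complex(psi[2 * a + b]) * complex(psi[2 * a2 + b]).conjugate()
    return rho


def purity(rho):
    """Return Tr(rho^2) for a square matrix given as nested lists."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real


def pure_state_purity(psi):
    """Purity of the full pure state, (sum of |amplitude|^2)^2; equals 1 when normalized."""
    return sum(abs(complex(c)) ** 2 for c in psi) ** 2


def renyi2(p):
    """Second-order Renyi entropy S2 = -ln(purity)."""
    return -math.log(p)


# Maximally entangled (Bell) state versus an unentangled product state.
bell = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]
product = [1, 0, 0, 0]

for name, psi in (("bell", bell), ("product", product)):
    s_whole = renyi2(pure_state_purity(psi))
    s_half = renyi2(purity(reduced_density_matrix(psi)))
    print(name, "whole:", round(s_whole, 6), "half:", round(s_half, 6))
```

For the Bell state the whole system has zero entropy while one half has entropy ln 2, which is the signature described in the text; for the product state both values are zero.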

As the first measurement of its kind, this is 
a milestone. But as with any first, the experi- 
mental techniques involved have been pushed 
to their limits. The ‘many-body’ systems there- 
fore consist of only four particles, primarily 
limited by how well the two copies interfere. 
With improvements, it should be possible 
to study larger numbers of atoms and more- 
interesting interacting systems. An intriguing 
possibility would be to measure higher-order 
entanglement — the current experiment meas- 
ures second-order entanglement, whereas 
nth-order entanglement would require n inter- 
fering copies. This would give further access 
to the entanglement spectrum, which yields 
complete knowledge of the quantum state of 
a system. 

Understanding the way in which complex 
many-body states appear and evolve in systems 
out of equilibrium is a hot topic in condensed- 
matter physics, because much of our world is 
not in equilibrium. This is an especially inter- 
esting question in closed systems for which 
there is no means of driving the system to a 
thermal equilibrium. Entanglement entropy 
will be a crucial tool for understanding non-equilibrium
systems, and Islam and colleagues’
experimental approach is easily adaptable to 
such studies. The authors’ proof-of-principle 
experiment also opens the door to a greater 
understanding of the role of entanglement in 
complex many-body systems through direct 
experimental observations. Given that both 
entanglement and entropy are sometimes 
perplexing concepts, the ability to acquire 
tangible information about them in the laboratory
will certainly benefit their study.
Tumour cells on
neighbourhood watch 


The discovery of microtube structures that link tumour cells in some
invasive brain tumours reveals how these cancers spread, and how they
resist treatment. SEE ARTICLE P.93


HARALD SONTHEIMER 


Hardly any diagnosis is as distressing as
that of a primary brain tumour. These
tumours, often known as gliomas, are
a varied group that originate from immature 
stem cells or from glial cells, which support 
and protect neuronal networks throughout 
the brain’. Gliomas proliferate uncontrol- 
lably, destroying surrounding brain tissue 
and causing profound neurological damage. 
They are responsible for around 14,000 deaths 
each year in the United States alone’, and are 
almost always deadly, owing to their resistance 
to radiation therapy and ability to infiltrate 
healthy brain tissue. In this issue, Osswald 
et al.° (page 93) shed light on what confers 
these destructive abilities. 

Gliomas have long frustrated neuro- 
surgeons’, because cancerous cells invade the 
surrounding brain before diagnosis is possi- 
ble, making surgical removal of these tumours 
inefficient. The tumours move into the brain
through extracellular spaces’, often following
the outside of blood vessels or nerve tracts — 
in contrast to most other cancers, which dis- 
seminate through the blood or lymphatic 
systems. The invading cells must be killed if 
treatment is to be successful, and so patients 
typically undergo aggressive radiation therapy 
in combination with chemotherapy. 

Many gliomas, including the most malignant 
varieties, resist both radiation and chemo- 
therapy. But a small subgroup called oligo- 
dendrogliomas, which harbour deletions in 
two chromosomal regions dubbed 1p and 19q, 
respond well to radiation treatment and carry a 
more favourable prognosis®. Osswald and col- 
leagues set out to determine what accounts for 
this difference in radiation sensitivity. 

The authors labelled patient-derived glioma 
cells taken from a variety of tumours before 
transplanting them into the brains of mice. 
Using in vivo microscopy, they visualized 
tumour growth and invasion for up to one year 
through a window implanted in the animals’
heads. Invading glioma cells without the
1p and 19q co-deletion extended long, thin,
contractile processes into the surrounding
brain tissue. The researchers called these
structures tumour microtubes.

Figure 1 | A network of neighbours. Osswald et al.’ report that, in some types of brain tumour, structures
called microtubes connect tumour cells, allowing them to act as a single, organism-like unit. Tumour
microtubes facilitate invasion into healthy brain tissue. They permit the spread of toxic molecules such as
calcium ions (Ca²⁺) that build up during radiation therapy, allowing the whole unit to share the burden of
toxicity. Furthermore, if tumour tissue is surgically removed, newly synthesized nuclei donated by tumour
cells can travel down the microtube to the cell-free site to form new tumour cells.

The microtubes were rich in the proteins 
actin and myosin, which are known’ to propel 
neuronal growth cones (structures at the tips 
of developing neurons that project out to seek 
other cells with which to connect). Some 
microtubes explored, and eventually invaded, 
the healthy brain. By contrast, others contacted 
neighbouring tumour cells, forming cytoplas- 
mic bridges between adjacent cells (Fig. 1). 
This effectively turned the cells into a single 
organismal unit (a syncytium). The connec- 
tions between microtubes from each tumour 
cell were formed by cytoplasm-filled pores 
called gap junctions, composed of hexameric 
Cx43 proteins. 

What advantage might the tumour gain 
from growing as a single organismal unit? 
Osswald et al. found that cells of the syncyt- 
ium were highly resistant to radiation therapy. 
In unconnected cells, radiation caused an 
increase in intracellular calcium-ion levels 
that triggered cell death. But in cells of the 
syncytium, Ca²⁺ levels remained more stable,
presumably because the extra Ca²⁺ was distributed
among all cells. Furthermore, when
the nucleus of a connected cell was ablated 
by laser, the neighbouring cell extended a 
microtube into the affected tissue to deliver a 
newly synthesized cell nucleus, thus replacing 
the dead neighbour with a newly nucleated 
cell, a remarkable feat. The connected
cells can thus be thought of as forming a
neighbourhood watch group, protecting one 
another by sharing toxic exposure and even 
going as far as replacing dead neighbours with 
part of themselves. 

These results indicate that those gliomas that 
are sensitive to radiation should not be part of 
a protective neighbourhood. Indeed, when 
Osswald et al. analysed human biopsies from 
oligodendrogliomas, fewer than 1% contained 
microtubes and Cx43 expression was almost 
absent. The authors also found that microtubes 
and Cx43 were lacking when they transplanted 
oligodendroglioma-derived cells harbouring 
the 1p and 19q co-deletion into mice. 

Because microtubes are at the heart of the 
glioma network, the authors next searched 
for genes that regulate microtube forma- 
tion. Through Ingenuity Pathway Analysis 
(a computer-aided method for detecting 
genes linked to biological traits), they iden- 
tified Gap-43 as a candidate. The GAP-43 
protein aids neuronal migration and the for- 
mation of neural growth cones during devel- 
opment’. It was prominently expressed on 
the invading tips of tumour microtubes, and 
was conspicuously absent in 1p and 19q co- 
deleted tumours. Moreover, forced expression 
of GAP-43 in oligodendroglioma-derived 
transplants produced highly invasive tumours 
that acted in protective, microtube-con- 
nected networks and resisted radiation. Thus, 
GAP-43 seems to mediate tumour-microtube 
formation. 

This carefully executed study advances our 
understanding of brain-tumour growth. For 
instance, it is now clear that the gap junctions
formed by Cx43 are central to the success of 
glioma neighbourhoods. This protein has 
been thought to act as a tumour suppressor, 
preventing cell division in cancers, includ- 
ing gliomas’. By contrast, Osswald and 
colleagues’ findings imply that tumour 
growth is enhanced by the presence of Cx43. 
This difference can be reconciled when one 
considers the organism-like growth of the 
syncytium — in this scenario, the connected 
cells protect and support one another, allowing
the tumour mass to grow even when exposed 
to radiation. 

Although the authors hypothesize that the 
sharing of Ca²⁺ confers resistance to radiation,
many other mechanisms could be at work. Gap 
junctions allow the passage of many macro- 
molecules between cells, including ATP, amino 
acids and even microRNAs. One molecule 
deserving of consideration is the antioxidant 
glutathione, which readily permeates gap junc- 
tions to directly protect cells from radiation 
damage’. 

From a therapeutic perspective, Cx43 is 
a challenging pharmacological target. The 
protein is expressed throughout the body, 
and is required both for glial transport of 
cellular metabolites, signals and waste prod- 
ucts, and to ensure that heart cells contract in 
synchrony. Drugs that block Cx43 have been 
used to heal chronic skin ulcers in humans” 
and to enhance the effectiveness of the chemotherapeutic
drug temozolomide in treating
gliomas in mice’’. However, Osswald and col- 
leagues’ discovery that GAP-43 is responsible 
for establishing glioma networks might point 
to a more effective target for combating these 
destructive cancers.
Managing nitrogen for sustainable development

Xin Zhang, Eric A. Davidson, Denise L. Mauzerall, Timothy D. Searchinger, Patrice Dumas & Ye Shen


Improvements in nitrogen use efficiency in crop production are critical for addressing the triple challenges of food 
security, environmental degradation and climate change. Such improvements are conditional not only on technological 
innovation, but also on socio-economic factors that are at present poorly understood. Here we examine historical patterns 
of agricultural nitrogen-use efficiency and find a broad range of national approaches to agricultural development and
related pollution. We analyse examples of nitrogen use and propose targets, by geographic region and crop type, to meet 
the 2050 global food demand projected by the Food and Agriculture Organization while also meeting the Sustainable 
Development Goals pertaining to agriculture recently adopted by the United Nations General Assembly. Furthermore, 
we discuss socio-economic policies and technological innovations that may help achieve them. 


More than half the world’s people are nourished by crops grown
with synthetic nitrogen (N) fertilizers, which were made possible
in the early twentieth century by the invention of the Haber–Bosch
process, which reduces atmospheric nitrogen gas (N₂) to reactive
forms of N (ref. 1). A reliable supply of N and other nutrients essential for 
plant growth has allowed farmers to increase crop production per unit 
land greatly over the past century, thus promoting economic develop- 
ment, allowing larger populations, and sparing forests that would proba- 
bly otherwise have been converted to agriculture to meet food demand”. 
Despite this progress, nearly one billion people remain undernourished’. 
In addition, the global population will increase by two to three billion by 
2050, implying that demands for N fertilizers and agricultural land are 
likely to grow substantially“. Although there are many causes of under- 
nourishment and poverty, careful N management will be needed to nour- 
ish a growing population while minimizing adverse environmental and 
health impacts. 

Unfortunately, unintended adverse environmental and human health 
impacts result from the escape of reactive N from agricultural soils, 
including groundwater contamination, eutrophication of freshwater 
and estuarine ecosystems, tropospheric pollution related to emissions 
of nitrogen oxides and ammonia gas, and accumulation of nitrous oxide, 
a potent greenhouse gas that depletes stratospheric ozone? (Fig. 1). 
Some of these environmental consequences, such as climate change and 
tropospheric ozone pollution, can also negatively affect crop yields!°"! 
and human health!”. Hence, too little N means lower crop productivity, 
poor human nutrition and soil degradation”, but too much N leads 
to environmental pollution and its concomitant threats to agricultural 
productivity, food security, ecosystem health, human health and eco- 
nomic prosperity. 

Improving nitrogen-use efficiency (NUE)—that is, the fraction of 
N input harvested as product—is one of the most effective means of 
increasing crop productivity while decreasing environmental deg- 
radation!*!>. Indeed, NUE has been proposed as an indicator for 
assessing progress in achieving the Sustainable Development Goals 
recently accepted by 193 countries of the United Nations General 
Assembly'®. Fortunately, we have a large and growing knowledge base 


and technological capacity for managing N in agriculture’’, and aware- 
ness is growing among both agricultural and environmental stakeholder 
groups that N use is both essential and problematic’’. This growing 
awareness, combined with ongoing advances in agricultural technology, 
is creating a possible turning point at which knowledge-based N man- 
agement could advance substantially throughout the world. However, 
improving NUE requires more than technical knowledge. The cultural, 
social and economic incentives for and impediments to farmer adoption 
of NUE technologies and best management practices need to be better 
understood”. 

Here we analyse historical patterns (1961–2011) of agricultural N use
in 113 countries to demonstrate a broad range of pathways of socio- 
economic development and related N pollution. Our analysis suggests 
that many countries show a pattern similar to an environmental Kuznets 
curve (EKC), in which N pollution first increases and then decreases 
with economic growth'*!. So far, most EKC analyses have focused on 
pollution from industrial and transportation sectors!*”””%; the present 
study is one of a few that consider agricultural N pollution in the EKC 
context”*5, and apply it globally. However, patterns of N pollution are 
neither automatic nor inevitable. Socio-economic circumstances and 
policies vary widely among countries, affecting factors such as ferti- 
lizer to crop price ratios and crop mixes, which, as our analysis shows, 
influence the turning points of the EKC. Although technological and 
socio-economic opportunities for NUE improvement vary regionally, 
our analysis shows that average global NUE in crop production needs 
to improve from ~0.4 to ~0.7 to meet the dual goals of food security and 
environmental stewardship in 2050. 


Patterns of nitrogen pollution 

As a useful indicator of potential losses of N to the environment from 
agricultural soils”°?”, N surplus (N_sur; in units of kg N ha⁻¹ yr⁻¹) is
defined as the sum of N inputs (fertilizer, manure, biologically fixed N,
and N deposition) minus N outputs”®”? (the N removed within the harvested
crop products, N_yield; Fig. 1). Some of the N_sur recycles within the
soil, but most N_sur is lost to the environment over the long term, because
the difference between annual inputs and outputs is usually large relative
to changes in soil N stocks. The related term of NUE, also called the
output-input ratio of N, is mathematically defined as the dimensionless
ratio of the sum of all N removed in harvest crop products (outputs
or N_yield) divided by the sum of all N inputs to a cropland*°! (Fig. 1).
The N_sur, NUE and N_yield terms can serve as environmental pollution,
agricultural efficiency, and food security targets*”*°, respectively, which
are inherently interconnected through their mathematical definitions?
(that is, $N_{\mathrm{sur}} = N_{\mathrm{yield}}\left(\frac{1}{\mathrm{NUE}} - 1\right)$; see Supplementary Information
section 1 for more information) and their real-world consequences (Fig. 1).

Figure 1 | An illustration of the N budget in crop production and
resulting N species released to the environment. Inputs to agriculture are
shown as blue arrows and harvest output as a green arrow. NUE is defined
as the ratio of outputs (green) to inputs (blue) (i.e. NUE = N_yield/N_input).
The difference between inputs and outputs is defined as N_sur, which is
shown here as orange arrows for N losses to the environment and as
N recycling within the soil (grey box) (that is, N_sur = N_input − N_yield).
Abbreviations: ammonia (NH₃), nitrogen oxides (NOₓ), nitrous oxide
(N₂O), dinitrogen gas (N₂), ammonium (NH₄⁺), nitrate (NO₃⁻), dissolved
organic nitrogen (DON) and particulate organic nitrogen (PON).
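
The definitions above (N surplus as inputs minus harvested outputs, and NUE as the ratio of outputs to inputs) can be written out as a short calculation. The following is a minimal sketch with hypothetical function names and purely illustrative numbers, not data from the study; it also checks that the surplus computed from yield and NUE agrees with the direct budget.

```python
def n_surplus(n_input, n_yield):
    """N surplus (kg N per ha per yr): total N inputs minus N removed in harvest."""
    return n_input - n_yield


def nue(n_input, n_yield):
    """Nitrogen-use efficiency: dimensionless ratio of harvested N to N inputs."""
    if n_input <= 0:
        raise ValueError("n_input must be positive")
    return n_yield / n_input


def n_surplus_from_nue(n_yield, nue_value):
    """N surplus expressed through yield and NUE: N_yield * (1/NUE - 1)."""
    if nue_value <= 0:
        raise ValueError("nue_value must be positive")
    return n_yield * (1.0 / nue_value - 1.0)


# Illustrative budget (made-up values): 100 kg N applied, 40 kg N harvested.
inputs, harvested = 100.0, 40.0
print(n_surplus(inputs, harvested))      # direct budget
print(nue(inputs, harvested))            # efficiency
print(n_surplus_from_nue(harvested, nue(inputs, harvested)))  # same surplus via NUE

# Surplus per unit of harvested N at two efficiencies, holding yield fixed.
for efficiency in (0.4, 0.7):
    print(efficiency, round(n_surplus_from_nue(1.0, efficiency), 3))
```

Holding yield fixed, raising NUE from 0.4 to 0.7 cuts the surplus per unit of harvested N from 1.5 to roughly 0.43, which illustrates why the three quantities cannot be targeted independently.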


Variable turning points on the EKC 
As an indicator of the extent of environmental degradation, N_sur aggregated
to a national average for all crops is closely related to income
growth, mainly in two contrasting pathways as follows. On the one hand, 
increasing income enables demand for more food consumption*’, which 
can increase both the land area devoted to agriculture and the intensity 
of agricultural production and consequently results in more N lost to the 
environment. On the other hand, increasing income is often accompa- 
nied by a societal demand for improved environmental quality, such as 
clean water and clean air, and is also accompanied by access to advanced 
technology'*'°. Consequently, governments may impose regulatory poli- 
cies or offer subsidies and incentives targeted at reducing local or regional 
N pollution, and farmers may adopt more efficient technologies. 
Therefore, we hypothesize that Nsur follows a pattern similar to the
EKC: Nsur increases with income growth and the quest for food security
at early stages of national agricultural development (first phase), but 
then decreases with further income growth during a more affluent stage 
(second phase), eventually approaching an asymptote determined by 
the theoretical limit of the NUE of the crop system (third phase, Fig. 2). 

Figure 2 | An idealized EKC for Nsur and the related curve for NUE.
a, The EKC for Nsur. b, The curve for NUE, which is related to the EKC for
Nsur. The theoretical limit for NUE (assuming no soil mining of nutrients)
is unknown, but no biological system is 100% efficient, so the hypothetical 
NUE limit is shown as close to but less than unity. 


Sustainable intensification of agriculture has been advanced as the key 
to achieving the second phase of the EKC, including use of cultivars 
best adapted to the local soil and climate conditions, improved water 
management, balancing N application with other nutrient amendments, 
precision timing and placement of fertilizer and manure applications 
to meet crop demands, the use of enhanced-efficiency fertilizers, and 
support tools to calculate proper dosing!*!7"4. While Nsur is the EKC
environmental degradation indicator, the mathematical relationship
between Nsur and NUE results in nearly mirror images in Fig. 2 (although
see Supplementary Information section 1 for a discussion of situations
in which Nsur and NUE can both increase simultaneously).

Of the three phases of the Nsur trend, it is the second phase of sustainable
intensification with increasing affluence that is of greatest
contemporary interest. The first phase of agricultural expansion is well 
documented*”?!, and the third phase cannot yet be evaluated. So far, 
no country has yet approached the third phase, nor do we know how 
close to 100% efficiency the use of N inputs could become. For the first 
phase, as incomes rise, virtually all countries initially increase fertilizer 
use, Nyield and Nsur while NUE decreases*”*!. To test the existence of the
second phase, we examine whether the relationship between gross 
domestic product (GDP) per capita and Nsur breaks away from the
linearly (or exponentially) increasing trend and follows more of a 
bell-shaped pattern over the long term. 

We tested the existence of a sustainable intensification phase 
(or an EKC pattern) with a five-decade record (1961-2011) of Nsur
and GDP per capita”®*>-*° with a fixed effects model*!-* across 113 
countries for which sufficient data were available and a regression 
model for each individual country'*“*"* (see sections 1 and 2 in the
Supplementary Information). The fixed effects model shows a significant
quadratic relationship between GDP per capita and Nsur (P < 0.001,
Supplementary Table 9). Regressions between GDP per capita and Nsur
for each individual country fall into five response types (examples of each 
group are shown in Fig. 3). Of the 113 countries, 56 countries (group 1) 
show bell-shaped relationships between Nsur and GDP per capita,
indicating that Nsur increased and then levelled off or decreased as economic
development proceeded, as expected for an EKC (two examples
are illustrated in Fig. 3a). Those 56 countries account for about 87% of 
N fertilizer consumption and about 70% of harvested area of all 113 
countries. These data provide support for an EKC pattern for N pollution 
from agriculture, although as we show below, the potential causes of 
EKC shapes and turning points are complex. Furthermore, for 28 of the 
56 countries, by 2011 the rate of increase in Nsur had only slowed or
levelled off and had not yet actually decreased, indicating likely but still 
uncertain conformance with an EKC (Supplementary Tables 5 and 6). 

Figure 3 | Examples of historical trends of the relationship between
GDP per capita and Nsur. The observations are the record of annual Nsur
smoothed using a ten-year window for each country; the model results are
the outcome of the regression using the following model: $Y = a + bX + cX^2$,
where the dependent variable Y is the country’s Nsur and the independent
variable X is the country’s GDP per capita. We categorized the 113 countries
into five groups, based on the significance (that is, P value) and sign of the
regression coefficients b and c (see Supplementary Information sections 2.1
and 3.1). a, France and USA are examples of group 1, which have significantly
negative c (Pc < 0.05 and c < 0), thus indicating that Nsur has started to level
off or has declined; b, Brazil, Thailand, Malawi and Algeria are examples of
groups 2-5, which increase nonlinearly, increase linearly, have no significant
correlation (Pb > 0.05 and Pc > 0.05), or have a negative surplus in 2007-2011,
respectively (see Supplementary Tables 5 and 6). The results for all countries 
can be found in the figures in the Supplementary Information. 

Countries with a linear or accelerating increase in Nsur (group 3
and most countries in group 2) as GDP per capita grew have not yet 
approached an EKC turning point (for example, Fig. 3b), but could still 
follow an EKC in the future as their N input growth slows and NUE 
increases. Most countries showing an insignificant (P > 0.05) relationship
between Nsur and GDP per capita (group 4) or with a negative Nsur
(group 5) have had such little income growth and use so little N that the 
EKC concept cannot be evaluated yet owing to limited change in the 
country’s GDP per capita (for example, Fig. 3b). 
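
The per-country regression described in the Figure 3 legend can be sketched as an ordinary least-squares quadratic fit. The example below is a simplified illustration only: the data are synthetic and noise-free, the function names are hypothetical, and it reads the sign of c directly instead of testing the significance (P values) of b and c as the study does.

```python
import numpy as np


def fit_quadratic(gdp, n_sur):
    """Least-squares fit of n_sur = a + b*gdp + c*gdp**2; returns (a, b, c)."""
    c, b, a = np.polyfit(gdp, n_sur, 2)  # polyfit returns the highest power first
    return a, b, c


def turning_point(b, c):
    """GDP per capita at which the fitted curve peaks, or None if it never does."""
    return -b / (2.0 * c) if c < 0 else None


# Synthetic example (not real data): GDP per capita in thousand US$,
# N surplus in kg N per hectare per year, following a bell-shaped curve.
gdp = np.linspace(1.0, 60.0, 50)
n_sur = 10.0 + 4.0 * gdp - 0.05 * gdp**2

a, b, c = fit_quadratic(gdp, n_sur)
peak_gdp = turning_point(b, c)
print(f"c = {c:.3f}, turning point near {peak_gdp:.1f} thousand US$")
```

A negative c corresponds to the bell-shaped (group 1) response; a positive or near-zero c would place a country among the nonlinearly or linearly increasing groups, subject to the significance tests omitted here.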

Classic empirical studies on EKC, such as Grossman and Krueger 
(ref. 19), have been criticized because of concerns regarding statistical 
analyses of time series data that may be non-stationary*””’. Therefore, 
we examined the stationarity of our data (Supplementary Table 7) and 
used the Autoregressive Distributed Lag modelling approach (ARDL)*, 
which is the most frequently used method for the co-integration 
test in EKC empirical studies published in the last decade*’, to test 
co-integration on a subset of the data. The ARDL regression models
showed the same long-term relationships between Nsur and GDP
per capita as presented above for all tested countries (Supplementary 
Table 8). The application of the ARDL method in EKC studies has 
also been criticized recently for including the quadratic term in the 
co-integration test, and some new methods have been proposed*!. 
Further evaluation is needed on the limitations and performance of the 
ARDL and newly proposed methods for EKC analyses. 

Another common criticism of the EKC concept is that the turning 
point for transitioning to declining environmental degradation is highly 
variable among pollutants and among countries'*°*°*. Consistent with
those observations, no specific value of GDP per capita was a good predictor
of turning points for Nsur on the EKC among countries in the present
study. For example, Nsur in Germany and France started to decline
when GDP per capita reached about US$25,000 in the 1980s, while Nsur
in the USA levelled off and started to decline more recently when GDP 
per capita reached about US$40,000. Our analysis also shows that countries
have widely differing values of NUE and Nsur even when yields are
similar. Some of this variation is probably due to underlying biophysical 
conditions, such as rainfall variability and soil quality, which influence 
crop choices, yield responses, and NUE. However, cultural, social, tech- 
nological, economic and policy factors also probably affect the turning 
points on the EKC trajectory of each country. 

The turning point in European Union (EU) countries appears to have 
been reached at least in part owing to policies*’. Beginning in the late 
1980s and through the early 2000s, increases in NUE and decreases in 
Nsur in several EU countries coincided with changes in the EU Common
Agricultural Policy, which reduced crop subsidies, and adoption of the 
EU Nitrates Directive, which limited manure application rates on cropland**°”.
In the USA, which has relied mostly on voluntary approaches, the levelling
off and modest decrease in Nsur since the 1990s are largely the result
of increasing crop yields while holding N inputs steady (Fig. 4a), which
has resulted from improved crop varieties, increased irrigation and other 
technological improvements*””*. A few state regulatory programmes 
have required nutrient management plans, placed limitations on fertilizer 
application dates and amounts, and required soil and plant testing, with 
varying degrees of success**-©. Concerns about water and air quality, 
estuarine hypoxic zones, stratospheric ozone depletion, and climate 
change have also stimulated many outreach efforts by governments, 
fertilizer industry groups, retailers, and environmental organizations 
to provide farmers with information, training and innovative financial 
incentives to improve NUE voluntarily’), 


Fertilizer to crop price ratios 

Policy can affect NUE not only through regulation and outreach, but also 
by affecting prices at the farm gate. The ratio of fertilizer to crop prices, 
Rfc, has been widely used in combination with data on yield responses
to fertilizer application to advise farmers on fertilizer application rates 
that yield optimal economic returns®-®. In addition to influencing fertilizer
application rates, Rfc also affects farmer decisions regarding their
choice of technologies and practices for nutrient management, all of
which affect NUE and Nsur (ref. 33). We tested whether the influence
of Rfc appears at the national level using two methods: one examines
the correlation coefficient of Rfc and NUE for individual countries, and
the other applies a fixed effects model to all data to test the correlation
between Rfc and NUE with and without including GDP per capita and
crop mix (see section 2.3 in Supplementary Information). Because both 
the fertilizer and crop prices are ‘at the farm gate’, they include the effects
of government subsidies*. The results for maize, for which the most data 
are available, indicate that the fertilizer to maize price ratio is positively 
correlated with NUE using both statistical approaches (Supplementary 
Table 12). We also found that maize prices are linearly correlated with 
the prices of most major crops, so we infer that the fertilizer to maize 
price ratio is likely to be a good index for the long-term trend of Rfc for
all crops. Indeed, we found a statistically significant (P < 0.001) positive
correlation between historical values of Rfc for maize and the NUE aggregated
for all other crops. Moreover, this correlation is still statistically
significant (P < 0.001) after adjusting for the effect of GDP per capita
and crop mix (Supplementary Table 11).

Figure 4 | A comparison of historical trends. a, Nationally averaged annual
fertilization rates and yields of maize in China and the USA. b, NUE averaged
across crops in China and the USA. c, Fertilizer to crop price ratios for
China, India, USA and France. The dashed blue line in a shows a typical yield
response function for maize based on fertilizer response trials**°’, which
demonstrates diminishing return in yield as N inputs increase. Note that the
historical trend for China follows a pattern similar to a typical yield response
function, indicating that further increases in N application rates will result
in diminishing yield returns in China. In contrast, maize yield has increased
in the USA since 2001 without increasing nationally averaged N input rates,
suggesting that the yield improvement has been achieved by adopting more
efficient technologies or management practices that shift the yield response
curve upwards**. The dashed pink line in b shows what the NUE in China
would be if it achieved NUE values realized in the USA for all crops, but
with the crop mix of China. The gap between the dashed pink line and the
black line (USA record) is the difference in NUE between countries that is
attributable to the differences in crop mixes. The fertilizer to crop price ratio
shown in c is determined by the N price of urea divided by the N price of
maize product (see section 1.6 in Supplementary Information for data sources
and methodologies). The data are smoothed using a ten-year window.
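
The Figure 4 legend describes the fertilizer to crop price ratio as the N price of urea divided by the N price of maize product. A minimal sketch of that calculation follows, reading "N price" as the product price per unit mass divided by the product's N mass fraction. Urea is about 46% N by mass; the maize N fraction and both prices below are placeholder assumptions chosen for illustration, not values from the study, and the function names are hypothetical.

```python
UREA_N_FRACTION = 0.46  # urea, CO(NH2)2, is about 46% N by mass


def n_price(product_price_per_kg, n_fraction):
    """Price per kg of N contained in a product, given its N mass fraction."""
    return product_price_per_kg / n_fraction


def fertilizer_to_crop_price_ratio(urea_price, maize_price, maize_n_fraction):
    """Ratio of the N price of urea to the N price of maize (dimensionless)."""
    return n_price(urea_price, UREA_N_FRACTION) / n_price(maize_price, maize_n_fraction)


# Placeholder inputs (assumptions): prices in US$ per kg of product,
# and an assumed N mass fraction for maize grain.
ratio = fertilizer_to_crop_price_ratio(
    urea_price=0.46, maize_price=0.14, maize_n_fraction=0.014
)
print(f"Rfc = {ratio:.2f}")
```

A subsidy that lowers the farm-gate urea price lowers this ratio, which is the direction of change the text associates with lower NUE.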

Increases in Rfc since the 1990s, in both France and the USA
(Fig. 4c), coincided with increases in NUE (ref. 57) and may have 
affected the EKC turning point. At the other extreme, both China and 
India have had declining values of Rfc (Fig. 4c), owing to heavily subsidized
fertilizer prices”. Fertilizer subsidies reached US$18 billion
in China in 2010 (ref. 66). Rates of N inputs have now reached levels 
of diminishing returns for crop yield in China (Fig. 4a), and China has 
the largest Nsur and one of the lowest nationally averaged NUE values
in the world (Table 1). The very low Rfc in China incentivizes farmers to
attempt to increase crop yield by simply adding more N or by choosing 
more N-demanding cropping systems (for example, change from cereal 
production to greenhouse vegetable production’) instead of adopting 
more N-efficient technologies and management practices. 

Not all fertilizer subsidies are inappropriate. Where infrastructure for 
producing and transporting fertilizers is poor, as is the case for most of 
Africa, the cost can be so high that fertilizer use is prohibitively expensive 
for smallholder farmers, resulting in low yield and small, even negative 
Nsur (soil mining). In these cases, there is room for fertilizer subsidies
to increase N inputs, because significant increases in N inputs could 
be absorbed and greatly increase crop yields without much immediate 
risk of N pollution®*-”°. When properly designed, temporary fertilizer 
subsidies structured to build up the private delivery network and with 
a built-in exit strategy can be an appropriate step’'. The longer-term 
question for these countries will be whether they can ‘tunnel through’ the
EKC by shifting crop production directly from a low-yield, high-NUE
status to a high-yield, high-NUE status. This shift will require leapfrog- 
ging over the historical evolution of agricultural management practices 
by employing technologies and management practices that promote high 
NUE before Nsur grows to environmentally degrading levels. Acquiring
and deploying such technologies, such as improved seed, balanced nutri- 
ent amendments, and water management, will require investments in 
technology transfer and capacity building. 


Importance of crop mix 

Another factor that may confound EKC trajectories is the mix of crops 
countries grow over time, which is affected by both demand and trade 
policies’”. For example, changing patterns of crop mixes help to explain 
some of the differences between China and the USA. Since the 1990s an 
increasing percentage of agricultural land in China has been devoted to 
fruit and vegetable production, and N application to fruits and vegetables 
now accounts for about 30% of total fertilizer consumption?®”4, with an
average NUE of only about 0.10 (which is below the globally averaged
NUE for fruits and vegetables of 0.14, and well below the global averages
for other major crops; Table 1)”“7°. At the same time, China has been
increasingly relying on imported soybeans, an N-fixing crop that has very
low Nsur (Table 1)”°. In contrast, US soybean production has been growing
and now accounts for about 30% of the harvested area for crop production
(excluding land devoted to production of grasses or crops for feeding livestock)
in the USA. While fertilizer subsidies in China probably account
for much of the low NUE there, our analysis shows that the difference
in crop mix also accounts for nearly half of the NUE difference between
China and USA (Fig. 4b).

Table 1 | N budget and NUE in crop production by region and crop in 2010 and projected for 2050

                                Current (2010)                                Projected (2050)
                                Harvest N    Input N      NUE    Surplus N    Harvest N*   Target NUE   Required input N   Resulting surplus N
                                (Tg N yr⁻¹)  (Tg N yr⁻¹)         (Tg N yr⁻¹)  (Tg N yr⁻¹)               (Tg N yr⁻¹)        (Tg N yr⁻¹)
By region†
China                           13           51           0.25   38           16           0.60         27                 11
India                           8            25           0.30   18           11           0.60         19                 8
USA and Canada                  14           21           0.68   7            19           0.75         25                 6
Europe                          7            14           0.52   7            10           0.75         13                 3
Former Soviet Union             4            6            0.56   3            6            0.70         8                  2
Brazil                          6            11           0.53   5            10           0.70         15                 4
Latin America (except Brazil)   7            12           0.52   6            10           0.70         15                 4
Middle East and North Africa    3            …            0.48   3            4            0.70         5                  2
Sub-Saharan Africa              4            …            0.72   2            9            0.70         13                 4
Other OECD countries            1            …            0.52   1            2            0.70         2                  1
Other Asian countries           8            19           0.41   11           10           0.60         17                 7
Total                           74           174          0.42   100          107          0.67         160                52
By crop type‡
Wheat                           13           30           0.42   17           18           0.70         25                 8
Rice                            11           29           0.39   18           14           0.60         23                 9
Maize                           13           28           0.46   15           19           0.70         28                 8
Other cereal crops              5            9            0.53   4            7            0.70         …                  3
Soybean                         16           20           0.80   4            24           0.85         28                 4
Oil palm                        1            1            0.46   1            1            0.70         2                  1
Other oil seed                  4            10           0.43   6            8            0.70         …                  3
Cotton                          2            …            0.37   3            3            0.70         5                  1
Sugar crops                     1            …            0.19   4            2            0.40         …                  2
Fruits and vegetables           3            25           0.14   21           5            0.40         …                  7
Other crops                     5            11           0.41   7            …            0.70         …                  3
Total                           74           174          0.42   100          107          0.68         157                50

The 2010 record is aggregated from our N budget database (see Supplementary Information section 1 for detailed methodologies and data sources used in developing this database). The 2050
projected harvest N is derived from a FAO projection of crop production to meet a scenario of global food demand (ref. 3). The calculated target NUE values for 2050 are not meant to be prescriptive for
particular countries or crops; rather, they are presented to illustrate the types of NUE values that would be needed, given this assumption of food demand, while limiting Nsur to near the lower bound
(50 Tg N yr⁻¹) of allowable N pollution estimated in planetary boundary calculations (ref. 78). Harvest N, input N and surplus N values are rounded to the nearest Tg N yr⁻¹.
*The projected harvest N is based on an FAO scenario (ref. 3) for 2050 that assumes a world population of 9.1 billion people and increases in average caloric consumption to 3,200 kcal per capita in Latin
America, China, the near East and north Africa, and an increase to 2,700 kcal per capita in sub-Saharan Africa and India. Consumption of animal products increases in developing countries, but
differences between regions remain.
†The definitions of the country groups are in Supplementary Table 13.
‡The crop group is defined according to the International Fertilizer Industry Association’s report on fertilizer use by crop (ref. 28).

To address this issue globally, we tested the relationship between 
NUE and the fraction of harvested area for fruits and vegetables with a 
fixed effects model for the 113 countries (Supplementary Table 11). The 
fraction of harvested area for fruit and vegetable production negatively 
correlates with NUE, and that relationship is still significant (P < 0.001) 
even after adjusting for the effect of GDP per capita. 


Meeting the growing challenge 

Agriculture is currently facing unprecedented challenges globally. On 
one hand, crop production needs to increase by about 60%-100% from 
2007 to 2050 to meet global food demand*””~””. On the other hand,
anthropogenic reactive N input to the biosphere has already exceeded a 
proposed planetary boundary**®, and the increasing demand for food and 
biofuel is likely to drive up N inputs even further. Therefore, it is critical 
to establish global and national goals for N use in crop production and to 
use those goals as reference points to evaluate progress made and guide 
NUE improvement. 


Global and national goals 

The planetary boundary for human use of reactive N that can be 
tolerated without causing unsustainable air and water pollution has 
been defined in mainly two ways: (1) as the maximum allowable 
amount of anthropogenic newly fixed N in agriculture that can be 
introduced into the earth system (62-82 Tg N yr⁻¹)>*°, and (2) as
the maximum allowable Nsur released from agricultural production to
the environment. 

Calculations of planetary boundaries according to the first defini- 
tion require assumptions about nutrient-use efficiency in agriculture. 
As NUE increases, more N inputs would be manageable while still 
remaining within air and water pollution limits because more applied 
N would be taken up by harvested crops. Therefore, rather than focusing
on a planetary boundary of allowable newly fixed N, which varies
depending on the NUE assumption, we follow the second approach, by 
estimating what NUE would be needed to produce the food demand 
projected for 2050 (ref. 3; Table 1) while keeping Nsur within the
bounds estimated for acceptable air and water quality. Over 60% of
N pollution is estimated to originate from crop production”, so this is
the primary sector that must be addressed to reduce N pollution. From
an analysis of the implications of N cycling in several “shared socio-economic
pathways”*', Bodirsky et al. (ref. 78) calculated that global
agricultural Nsur should not exceed about 50-100 Tg N yr⁻¹. Therefore,
we use 50 Tg N yr⁻¹ as an estimate of the global limit of Nsur from crop
production.

Meeting the 2050 food demand of 107 Tg N yr⁻¹ projected by the
Food and Agriculture Organization (FAO, ref. 3) while reducing Nsur
from the current 100 Tg N yr⁻¹ to a global limit of 50 Tg N yr⁻¹ (ref. 78)
requires very large across-the-board increases in NUE. Globally, NUE
would increase from ~0.4 to ~0.7, while the crop yield would increase
from 74 Tg N yr⁻¹ to 107 Tg N yr⁻¹ (Table 1). Recognizing regional differences
in crop production and development stage, this average could
be achieved if average NUE rose to 0.75 in the EU and USA, to 0.60 in
China and the rest of Asia (assuming they continue to have a high proportion
of fruits and vegetables in their crop mix), and to 0.70 in other
countries, including not dropping below 0.70 in sub-Saharan Africa as it
develops (Table 1). Similarly, NUE targets could be established for individual
crops, such as improving the global average from 0.14 to 0.40 for
fruits and vegetables, and increasing the global average NUE for maize
from 0.50 to 0.70 (Table 1).

The challenges in achieving these ambitious goals differ among
countries. Figure 5 shows the trajectories of major crop producing
countries on the yield-NUE map for the last five decades. The x and
y axes show the two efficiency terms in crop production, NUE and
Nyield, while the greyscale displays Nsur. To compare the nationally
averaged field-scale (in units of kg N ha⁻¹ yr⁻¹) Nsur in Fig. 5 to a
global limit of 50-100 Tg N yr⁻¹, the global average Nsur target would
need to be 39-78 kg N ha⁻¹ yr⁻¹ across the 2010 harvested area of
1.3 billion hectares. For the examples shown, the USA, France, and
Brazil appear to be on this trajectory, although further progress is
still needed. In contrast, China and India not only have not yet found
an EKC turning point, but also have much ground to make up to
reduce their Nsur once they turn the corner on their EKC. Although
a great challenge, this could also be seen as an opportunity to reduce
fertilizer expenditures while increasing agricultural productivity.
Malawi, like many sub-Saharan African countries and other least
developed countries, has been on a classic downward trajectory of
decreasing NUE as it has started to increase N inputs, although evidence
from recent years suggests that this decline may have reversed,
which would be a necessary first step to tunnelling through the EKC
(Fig. 5).

Figure 5 | Historical trends of Nyield, NUE and Nsur for a sample of
countries examined in this study. The greyscale shows the level of Nsur.
The area covered in red indicates negative Nsur, where the crop production
is mining soil N. The data have been smoothed by ten years to limit the
impact of year-to-year variation in weather conditions. Curves moving
towards the lower right indicate that those countries are achieving yield
increases by sacrificing NUE and increasing Nsur, whereas curves moving
towards the upper right indicate countries achieving yield increases by
increasing NUE and resulting in steady or decreasing Nsur.
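
The arithmetic behind the 2050 targets and the per-hectare surplus limit can be sketched as follows. The required input is taken as projected harvest divided by target NUE, and the global limit is spread over the 2010 harvested area. The function names are illustrative, and the inputs are the rounded figures quoted in the text and Table 1, so the results differ slightly from the published values (for example, roughly 38-77 against the quoted 39-78 kg N ha⁻¹ yr⁻¹), presumably because the source calculations used unrounded data.

```python
KG_PER_TG = 1e9  # 1 Tg = 10**12 g = 10**9 kg


def required_input(projected_harvest, target_nue):
    """N input needed to harvest `projected_harvest` at efficiency `target_nue`."""
    return projected_harvest / target_nue


def resulting_surplus(projected_harvest, target_nue):
    """N surplus left over once the projected harvest is removed from the input."""
    return required_input(projected_harvest, target_nue) - projected_harvest


def surplus_per_hectare(total_surplus_tg, harvested_area_ha):
    """Convert a global surplus in Tg N per year to kg N per hectare per year."""
    return total_surplus_tg * KG_PER_TG / harvested_area_ha


# Global 2050 scenario: 107 Tg N/yr harvested at a target NUE of 0.67.
global_input = required_input(107, 0.67)
global_surplus = resulting_surplus(107, 0.67)

# Global limit of 50-100 Tg N/yr spread over 1.3 billion harvested hectares.
low = surplus_per_hectare(50, 1.3e9)
high = surplus_per_hectare(100, 1.3e9)

print(f"input {global_input:.1f}, surplus {global_surplus:.1f} Tg N/yr")
print(f"per-hectare limit {low:.1f} to {high:.1f} kg N/ha/yr")
```

The same two functions apply to any row of Table 1, which makes it easy to see how sensitive the resulting surplus is to the assumed target NUE.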


Achieving NUE targets 

Achieving ambitious NUE targets while also increasing yields to meet 
future food demands requires implementation of technologies and man- 
agement practices at the farm scale, which has been described widely and 
in considerable detail in the agricultural, environmental, and develop- 
ment literature’’. Some common principles include the ‘4Rs’ approach 
of applying the right source, at the right rate, at the right time, in the right 
place**. However, the technologies and management practices needed to 
achieve the 4Rs vary regionally depending on the local cropping systems, 
soil types, climate and socio-economic situations. Where improvements 
in plant breeding, irrigation, and application of available 4R technologies 
have already made large gains, new technological developments may be 
needed to achieve further gains, such as more affordable slow-release 
fertilizers, nitrification and urease inhibitors, fertigation (that is, apply- 
ing fertilizer via irrigation water), and high-tech approaches to precision 
agriculture®®. 

It is promising that the development and the combination of information
technology, remote sensing, and ground measurements will make
information about precision farming more readily available, accessible, 
affordable and site-specific*”. In many cases, large gains could still be 
made with more widespread adoption of existing technologies, but 
a myriad of social and economic factors affecting farmer decision- 
making regarding nutrient management have only recently begun to 
receive attention and are critical in improving NUE (ref. 15). Socio- 
economic impediments, often related to cost and perceived risk, as well 
as lack of trust in recommendations by agricultural extension agents, 
often discourage farmers from adopting improved nutrient management
practices?**3*4. Experience has shown that tailoring regulations,
incentives, and outreach to local conditions, administered and enforced 
by local entities, and supported by trust established among local stakeholders
improves the success of efforts designed to increase NUE (ref. 15).

Although much of the work must be done at the farm scale, there 
are important policies that should be implemented on national and 
multi-national scales. First, improving NUE should be adopted as one 
of the indicators of the Sustainable Development Goals’® and should 
be used in conjunction with crop yield and perhaps other soil health 
parameters to measure the sustainability of agricultural development. 
To report reliably on a NUE indicator, countries should be strongly 
encouraged to collect data routinely on their N management in crop 
and livestock production. These data should be used to trace trajectories 
of the three indices of agricultural N pollution, agricultural efficiency 
and food security targets (that is, Nsur, NUE and Nyield), as we have done
here (Fig. 5) to demonstrate where progress is being made and where 
stronger local efforts are needed. The data used to construct Fig. 5 have 
served to demonstrate trends, but both improved data quality and inter- 
national harmonization of data standards are needed. Regular attention 
should be given to these trends to establish national and local targets and 
policies. Just as protocols established by the Intergovernmental Panel on 
Climate Change permit nations to gauge their progress and commitment 
for reducing greenhouse gas emissions, protocols for measuring and 
reporting on a Sustainable Development Goal pertaining to NUE could 
enable governments to assess their progress in achieving food security 
goals while maintaining environmental quality. 

Second, nutrient management in livestock operations and human dietary
choices need more attention. Here we have focused entirely on crop
production, largely because of availability of data, but the Nsur, NUE and
Nyield indices are equally important in livestock management®. Indeed,
soybeans and some cereals have high NUE as crops, but when fed to 
livestock, efficient recycling of the N in manure is challenging, resulting 
in lower integrated NUE for the crop-livestock production system**. The 
crop production scenario used here for 2050 (Table 1) makes assump- 
tions about future dietary choices’, which are beyond the scope of this 
study, but we note that future trends in diet will affect the demand for 
crop and livestock products, the crop mixes grown, and hence the NUE 
and Nsur of future agricultural systems”.

Third, a similar approach to efficiency analysis would also be 
valuable for phosphorus (P) fertilizer management, interactions of 
N and P management, and reducing both N and P loading into aquatic 
ecosystems®”-*°, 

Fourth, national and international communities should facilitate tech- 
nology transfer and promote agricultural innovation. Stronger interna- 
tional collaborations and investments in research, extension and human 
resources are urgently needed so that knowledge and experience can be 
shared, creating political and market environments that help to incen- 
tivize the development and implementation of more efficient technolo- 
gies. Technology transfer and capacity building will be needed to enable 
sub-Saharan African countries to tunnel through the EKC (Fig. 5). 

These solutions to improving NUE will require cross-disciplinary 
and cross-sectorial partnerships, such as: (1) integrating research and 
development of innovative agricultural technology and management 
systems with socio-economic research and the outreach needed for such 
innovations to be socially and economically viable and readily adopted 
by farmers; (2) analysing the nexus of food, water, nutrients and energy
management to avoid pollution swapping (a measure designed to address 
one pollution problem leads to another; for example, retaining crop residues
can reduce nitrogen runoff, but may lead to higher N₂O emission®')
and to optimize the net benefits to farmers, the environment and society; 
(3) promoting knowledge and data sharing among private and public 
sectors to advance science-based nutrient management; and (4) training 
the next generation of interdisciplinary agronomic and environmental 
scientists equipped with broad perspectives and skills pertaining to food, 
water, energy and environment issues. 

The EKC has often been described as an optimist’s view of a world 
with declining environmental degradation. Here we have shown that 
there is evidence—indeed, there is hope—for the EKC pattern of declin- 
ing N pollution with improving efficiencies in agriculture. However, we 
have also shown that continuation of the progress made so far is neither 
inevitable nor is it sufficient to achieve the projected 2050 goals of both 
food security and environmental stewardship. Turning points and trajec- 
tories of national agricultural EKCs will depend largely on agricultural, 
economic, environmental, educational and trade policies, and these will 
largely dictate the food and pollution outputs of future agriculture. 
The contentious nature of soil organic matter
The exchange of nutrients, energy and carbon between soil organic matter, the soil environment, aquatic systems and
the atmosphere is important for agricultural productivity, water quality and climate. Long-standing theory suggests that 
soil organic matter is composed of inherently stable and chemically unique compounds. Here we argue that the available 
evidence does not support the formation of large-molecular-size and persistent ‘humic substances’ in soils. Instead, soil 
organic matter is a continuum of progressively decomposing organic compounds. We discuss implications of this view of 
the nature of soil organic matter for aquatic health, soil carbon-climate interactions and land management. 


Soil organic matter contains more organic carbon than global vegetation
and the atmosphere combined (Fig. 1). For this reason, the
release and conversion into carbon dioxide or methane of even a
small proportion of carbon contained in soil organic matter can cause 
quantitatively relevant variations in the atmospheric concentrations of 
these greenhouse gases’. Moreover, organic matter retains nutrients as 
well as pollutants in the soil, which improves plant growth and protects 
water quality”. Soils are also an important source of aquatic carbon, with 
implications for biogeochemical processes in rivers, lakes and estuaries’. 
Despite its recognized importance, there is a widely divergent view of the 
nature of soil organic matter. 

Biological, physical and chemical transformation processes convert 
dead plant material into organic products that are able to form intimate 
associations with soil minerals, making it difficult to study the nature 
of soil organic matter. Early research based on an extraction method 
assumed that a ‘humification’ process creates recalcitrant (resistant to 
decomposition) and large ‘humic substances’ to make up the majority of 
soil ‘humus’ (see Box 1). However, these ‘humic substances’ have not been
observed by modern analytical techniques. This lack of evidence means
that ‘humification’ is increasingly questioned, yet the underlying theory
persists in the contemporary literature, including current textbooks**. 

Here we argue in favour of a soil continuum model (SCM) that focuses 
on the ability of decomposer organisms to access soil organic matter and 
on the protection of organic matter from decomposition provided by soil 
minerals. Viewing soil organic matter as a continuum spanning the full 
range from intact plant material to highly oxidized carbon in carboxylic 
acids’ represents robust science and will facilitate the way we communicate 
between disciplines and with the public. Only such an evidence-based 
approach can allow for the development of mechanistic solutions to cli- 
mate, water quality and soil productivity issues (Fig. 1). The resulting 
knowledge should be integrated into conceptual and mechanistic models 
for the purpose of predicting carbon dioxide emissions from soils in a 
warming world, as well as of keeping water supplies clean, and of improv- 
ing and sustaining the ecosystem services of the world’s soils. Research 
aimed at reliable predictions of soil organic matter turnover should focus 
on investigating its spatial arrangement within the mineral matrix, the 
fine-scale redox environment, microbial ecology and interaction with min- 
eral surfaces under moisture and temperature conditions observed in soils. 


[Figure 1 graphic not reproduced. Recoverable panel text: Traditional view: applies solubility in alkaline solution as criterion; over- or underestimates reactivity in water (electron shuttling, metal adsorption). Traditional view: relies on organic matter quality for prediction of emissions; assumes greater temperature sensitivity of persistent organic matter. Traditional view: relies on formation of stable ‘humus’; observes organic matter properties in alkaline extracts. Recoverable labels and values: Atmospheric CO2, 829; Vegetation; 123; 59*.]

Figure 1 | Traditional and emergent views of the nature of soil organic 
matter affect how we predict and manage soil, air and water. Traditional 
‘humification’ concepts limit observations of soil organic matter to its 
solubility in alkaline extracts, unlike the emergent view of organic matter 
based on solubility in water and its accessibility to microorganisms. Soils 
are an important source of organic matter in aquatic ecosystems and are 
responsible for half of the atmospheric carbon recycling. Carbon stocks 
and flux values are from ref. 1, except where noted otherwise: brown 
numbers are stocks in Pg C and blue numbers are flows in Pg C yr⁻¹.
*Disaggregated value from 119 Pg C yr⁻¹ total emissions. †3% of total
carbon consumed by fire. ‡Estimate to balance soil carbon exports.
Historical reliance on an operational proxy

Soil organic matter research is difficult because organic compounds are 
thoroughly mixed with and often adhere to soil minerals. In arable soil, 
organic matter typically makes up less than 5% and could historically 
be discerned only by its dark coloration. Before advanced spectroscopic 
methods became available in the early 1990s, research on soil organic 
matter required that the organic phase be separated from the mineral 
phase through an extraction procedure. The most efficient of these sepa- 
ration procedures in terms of mass extracted® is an extraction with alkali 
(Box 1), which dates back to a report published in 1786 (ref. 9). Although 
the extraction is incomplete, selective and prone to creating artefacts 
(Box 1), the procedure became widely adopted and its products univer- 
sally accepted as experimental proxies for soil organic matter. 

Concerns that alkaline preparations are not appropriate representatives 
of soil organic matter were raised as early as 1888 (ref. 10) and 50 years 
later it was proposed that ‘humic’ nomenclature should be dropped
because the term relates only to a material obtained by a specific proce- 
dure. Unfortunately, these concerns were dismissed rather than disproved. 
Among the thousands of publications on ‘humic substances’, not one independently
confirms (for example, by direct spectroscopic observation)
that the ‘humic substances’ extracted by alkali are components of organic 
matter that exist separately in soil environments. 

Among the strongest arguments in favour of discarding the notion of 
‘humic substances’ is the absence of any agreement within the broader 
scientific community on how such materials are defined. ‘Humic sub- 
stances’ may be described in the soil sciences in three different ways: 
strictly operationally according to what can be extracted with an alkaline 
solution, with further subcategories of ‘humic’ and ‘fulvic’ acids as well
as unextractable ‘humins’; as an existing substance that is not merely an 
operational construct; or as a combination of the two (Box 1). Different 
research communities use the same vocabulary with very different connotations,
to the point of being contradictory: in soil science, ‘humic
substances’ are thought to have large molecular masses; in the environmental
sciences, they are characterized as small fragments; and a
classic textbook of aquatic geochemistry describes them as compounds
of variable mass and composition. These views have evolved over time,
so that now it is not obvious what the term ‘humic substances’ is intended
to convey unless it is explicitly defined. Despite this uncertainty and 
new insight from modern spectroscopic techniques (Box 2), the prod- 
ucts of alkaline extraction continue to be treated as physically existing 
entities*®!°, with research efforts focused on aligning theory with the 
behaviour and properties of a soil component proxy that is defined solely 
by solubility at an alkaline pH. 


Reconciling models of soil organic matter 

At present, three competing models for the fate of organic inputs to soil 
can be distinguished: (1) classic ‘humification’; (2) ‘selective preservation’;
and (3) ‘progressive decomposition’ (Fig. 2).

All three models assume that fragments of plants and soil fauna are first 
broken up into small pieces at the onset of decomposition. Evidence that 
such breakdown of dead leaves or roots takes place comes from the obser- 
vation that the majority of organic matter inputs to soil decays within the 
first year’. It is further known that plant residues must be degraded by 
enzymes to a relatively small size (typically less than 600 Da) before they 
can be actively transported across the cell walls of microorganisms!”""*. 
In terrestrial ecosystems, so-called exo-enzymes perform this function 
outside the microorganism!””°. Thus, at any time within a living soil, a 
continuum exists of many different organic compounds at various stages of 
decay”, moving down a thermodynamic gradient from large and energy- 
rich compounds to smaller energy-poor compounds”. 

(1) The ‘humification’ model is the oldest of the three concepts”. In 
its original definition ‘humification’ assumes a further transformation or 
synthesis of the initial decomposition products into large, dark-coloured 
compounds” (Fig. 3). The resulting macromolecules were thought to be 
rich in carbon and nitrogen structures specific to ‘humification’, resistant
to decomposition and, consequently, older than the rest of the soil
BOX 1
Traditional approach to the study 
of soil organic matter 


Since first used over 200 years ago, the alkaline extraction 
technique has undergone many iterations but the principle
has remained identical. In its modern version, the procedure
involves the addition of a sodium hydroxide solution with a very 
high pH of 13 to a soil sample. At this pH, most oxygen-containing 
functional groups in organic matter are ionized, making organic 
compounds bearing such groups much more soluble in water®’. 
After adding protons to the solubilized organic materials, a dark 
solid precipitates that is commonly called ‘humic acid’. The 
organic matter that remains soluble after reacidification is called 
‘fulvic acid’. The considerable proportion of organic matter that 
does not respond to the treatment, either for a lack of ionizable 
functional groups or because it was shielded from the harsh 
alkaline treatment by mineral protection, is named ‘humin’. 

This multi-step procedure created the need to distinguish 
several categories of what constitutes soil organic matter. These 
categories vary widely between authors. The conceptual problem 
with defining ‘humic substances’ by an extraction procedure is 
threefold: 

(1) The extraction is always incomplete, leaving 50%-70% 
of the organic carbon unextracted, which is then defined as 
the insoluble ‘humin’ fraction!°®. This precludes the use of the 
extractable ‘humic and fulvic acids’ as true representatives 
of total soil organic matter. The alkaline solution will also 
extract portions of soil fractions that are not meant to be 
included in ‘humic substances’, such as living biomass, 
simple and identifiable biomolecules (often included as ‘non- 
humic’ substances in ‘humus’), dissolved organic matter or 
undecomposed leaves and roots (isolated as particulates). 
How these separately assessed fractions should be 
distinguished from the unextracted ‘humins’ (that are part 
of ‘humic substances’) is often unclear. The sum of ‘humic’ 
and ‘non-humic’ substances is defined as ‘humus’, a term 
that is sometimes considered to be synonymous with soil
organic matter, sometimes not, and is sometimes not
used at all.

(2) The harsh alkaline treatment at pH 13 ionizes 
compounds that would never dissociate within the wider soil 
pH range (pH 3.5 to pH 8.5), giving the resulting ‘humic’ and 
‘fulvic’ fractions the character of highly selective preparations 
with an exaggerated chemical reactivity rather than that of 
true isolates. 

(3) The development of this extraction method preceded 
theory, tempting scientists to develop explanations for the 
synthesis of materials resembling operationally extracted 
‘humic substances’, rather than to develop an understanding 
of the nature of all organic matter in soil. Over time, 
this attempt to mechanistically explain the formation of 
operationally defined ‘humic substances’ also led to their 
definition as synthesis products without the link to the alkaline
extraction.
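
Point (2) of this box can be illustrated with the Henderson-Hasselbalch relation, which gives the deprotonated fraction of an acidic functional group as 1 / (1 + 10^(pKa - pH)). The sketch below is not part of the source: the pKa values (about 4.5 for a carboxyl group and about 10 for a phenolic hydroxyl) are typical textbook figures assumed purely for illustration, and the function and constant names are hypothetical.

```python
def ionized_fraction(pka: float, ph: float) -> float:
    """Deprotonated fraction of an acidic group at a given pH
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumed, illustrative pKa values (not taken from the source).
ASSUMED_PKA = {"carboxyl": 4.5, "phenolic hydroxyl": 10.0}

# Soil pH range quoted in Box 1 (3.5 to 8.5) and the extraction pH (13).
PH_VALUES = (3.5, 8.5, 13.0)


def ionization_table() -> dict:
    """Ionized fraction for every assumed group at every pH of interest."""
    return {
        group: {ph: ionized_fraction(pka, ph) for ph in PH_VALUES}
        for group, pka in ASSUMED_PKA.items()
    }


if __name__ == "__main__":
    for group, row in ionization_table().items():
        cells = ", ".join(f"pH {ph}: {frac:.3f}" for ph, frac in row.items())
        print(f"{group}: {cells}")
```

Under these assumed pKa values, a phenolic group is almost entirely protonated across the soil pH range but almost entirely ionized at pH 13, which is one way to picture why the extract behaves more reactively than the same material would in soil.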


organic matter. Given the lack of a universally accepted definition of 
‘humic substances’ across disciplines and the lack of evidence for their 
physical existence independent of the alkaline extraction procedure, it is 
no surprise that there is no agreement on the processes and pathways of 
‘humic substance’ formation either (Box 2). These ‘humic substances’ are
variously considered to be ecologically useful (providing cation exchange 
capacity), chemically reactive (interacting with iron, aluminium and other 
BOX 2
Critique of the ‘humification’ model 


A consolidated assessment of published evidence (Fig. 3) reveals 
that secondary synthesis of ‘humic substances’ facilitated by 
minerals or enzymes has not been shown to be relevant in 
natural systems. On these grounds we find it inadvisable to 
support the classic ‘humification’ model. Evidence based on 
isotopic labelling!°” or on the testing of numerous decomposer 
organisms!° leaves little doubt that the supposedly recalcitrant 
‘humic substances’ can be decomposed at surprisingly
fast rates. The dark colour of ‘humic’ extracts generated in
laboratory experiments can be satisfactorily explained
by a combination of two processes: the degradation of natural 
pigments and the accumulation of molecules containing random 
conjugated bonds (which appear dark in the mixture). Large 
molecular masses of hundreds to millions of daltons (mostly 
10,000-100,000 daltons) reported in early studies!* have more 
recently been found to consist of self-assembled aggregates of 
small compounds mimicking large molecules. Contrary
to many earlier interpretations, the old radiocarbon age of some 
alkaline extracts!! is not a valid criterion for the persistence of 
decomposed organic matter, but merely an indication of when 
the carbon was fixed by photosynthesis!!%. The chemical 
structures of so-called polyaromatic carbon compounds (carbon 
in ring structures) often observed in the extracts are routinely 
produced by both plants and microorganisms and include 
melanins, tannins and antibiotics (polyketides)!!*1!5. However, 
these compounds have a clear physiological purpose and are 
therefore not the products of a random decomposition process. 
Ubiquitous thermally altered carbon from vegetation fires found 
in most soils!!®1!8 is also polyaromatic, and a portion of such 
compounds is typically extracted in alkaline solution®?1!9. 
Heterocyclic nitrogen (nitrogen embedded in a carbon ring 
structure) has been proposed to result from secondary synthesis, 
but evidence is only available to demonstrate its origin from 
fires!@° or from artefacts during analyses!>!2!. The glass transition 
sometimes observed in materials from alkaline extracts!@? has 
been attributed to ‘humification’!3, because glass transition 
behaviour requires a degree of molecular order. But the glass 
transition can also be found in many microbial products!#* and 
fire-altered organic matter!2° (in which the processes are well 
established). 


metals), and—particularly relevant for biogeochemical models—also 
inherently ‘stable’ against further decomposition”. The suite of hypothet- 
ical transformation processes became collectively known as ‘humification’ 
and is also called the ‘synthesis concept of the genesis of humic substances’ 
or ‘secondary synthesis’ (Fig. 2).

(2) ‘Selective preservation’, which is also called preferential decomposition,
is a newer concept informed by decomposition studies of
leaves?’ and visible plant fragments in soils*®. This concept assumes 
that organic inputs are composed of both labile and relatively recalcitrant 
compounds”, the latter being used by microorganisms only when the 
former are exhausted. However, there is now robust evidence that, under 
suitable conditions, appropriately adapted decomposer organisms have 
the ability to decompose even presumably persistent materials more 
quickly than previously anticipated, including polycondensed aromat- 
ics*°, alkanes in soil*!, fire-derived carbon’, crude oil in sea water’, 
and even polyethylene**. Also, contrary to previous assumptions*, the 
decomposition of presumably recalcitrant lignin is fastest at the early 
stages of decomposition, as long as it is easily accessible and small 
organic molecules are available as a source of energy to help mineralize 
the lignin*®. 


62 | NATURE | VOL 528 | 3 DECEMBER 2015 


(3) In the progressive decomposition model (also called ‘biopolymer 
degradation’; or ‘the degradative concept’!*”*), soil organic matter 
consists of a range of organic fragments and microbial products of all 
sizes at various stages of decomposition”** (Fig. 2). Several independ- 
ent lines of evidence revealed alkali-extracted ‘humic substances’ to 
be a mixture of identifiable compounds such as fragments of plants 
or microorganisms*?~*! that are distributed in different locations of 
micro-aggregates”**, showing no similarity to the ‘humic’ extract*”, and 
having small size. Upon cell death, materials that are synthesized in
the course of microbial anabolism are released into the soil, where they are 
subject to further degradation. Throughout this process, these materials 
remain on an energetic downhill trajectory*’, as opposed to the hypo- 
thetical ‘humic substances’ (Fig. 2), whose ‘secondary synthesis’ would 
require energy investments for which no thermodynamic rationale has 
been provided so far’. 

Using recognized chemical, physical and biological controls on soil 
carbon turnover, the available evidence can reconcile those existing 
theories into a SCM (Fig. 2). In the SCM concept, organic matter exists 
as a continuum of organic fragments that are continuously processed by 
the decomposer community towards smaller molecular size””°!. The 
breakdown of large molecules leads to a decrease in the size of primary 
plant material with concurrent increases in polar and ionizable groups, 
and thus to increased solubility in water. At the same time, the oppor- 
tunity for protection against further decomposition increases through 
greater reactivity towards mineral surfaces and incorporation into 
aggregates (Fig. 2). Modern analytical tools for the characterization of 
biomolecules in microbial cells and soils now suggest a direct and rapid 
contribution of microbial cell walls to soil organic matter protected by 
interaction with minerals*”°. Adsorption may be followed by desorption, 
exchange reactions with competing organic compounds, and biotic or 
abiotic degradation. An obvious consequence of microbial involvement 
in the decomposition process is the direct deposition of microbial cells, 
cell debris, exopolysaccharides, and root exudates on mineral surfaces. 

Only the SCM explains the variations in turnover time of organic 
compounds through variations in the presence or absence of decom- 
poser organisms and enzymes and the energy they require, through the 
properties and abundance of mineral surfaces that may protect organic 
matter, and through the availability of numerous other resources (such 
as oxygen and nutrients)°">*. The vast portfolio of options for variations 
in carbon turnover dynamics in the SCM provides a full explanation of 
organic matter properties as observed by contemporary, in situ spectro- 
microscopic techniques*”-” without invoking ‘humification’ processes or 
‘humic substances’. Consequently, the SCM does not require microbial
or abiotic generation of recalcitrance through the formation of specific 
organic compounds and is in agreement with the stated need to focus 
on spatial arrangement of soil organic matter°’ and environmental con- 
trol such as temperature, moisture or soil mineralogy*’. Decomposition 
pathways, sequences and rates therefore evolve as a specific function of a 
given soil system. The SCM offers a way forward in modelling soil carbon 
dynamics and developing soil management that is based on observable 
evidence, as discussed below. 


Environmental relevance 

The SCM view of the nature of soil organic matter—which excludes 
any secondary synthesis of ‘humic substances’—has implications for a 
range of disciplines that build on the science of organic matter properties 
and changes in soil (Fig. 1). This is all the more important as the ‘humic 
substances’ concept is very widely adopted outside the soil sciences, with 
the majority of publications focusing on ‘humic substances’ published in 
journals that do not explicitly cover soil science. 


Soil carbon modelling 

Soils contain more organic carbon than the atmosphere and vegetation 
combined! and predictions of soil organic matter dynamics could there- 
fore greatly influence forecasts of global climate change. Major soil carbon 
models such as Century or RothC are built on the premise that soil
organic matter can be divided into pools that have different turnover
times. None of these models explicitly represents the characteristic processes
of carbon transformation detailed in the SCM, such as adsorption
and protection, desorption, and microbial activity. Although carbon
movement between pools and their decomposition rates are modified
by temperature, texture and moisture, the default turnover rates associated
with individual carbon pools are justified by the combined influence
of physical protection and an inferred resistance to decomposition
that is dependent on substrate quality (quality is here used in the sense
of molecular composition of the organic matter). Particularly for the
‘slow’ and ‘passive’ pools, this inherent resistance to decomposition
(recalcitrance) has been understood to be the result of ‘humification’, with
the RothC model explicitly including ‘humus’ fractions. Lack of mechanistic
representation of the decomposition process produces disagreement
among models and between model predictions and observational
data.

Figure 2 | Reconciliation of current conceptual models for the fate of
organic debris into a consolidated view of a SCM of organic matter
cycles and ecosystem controls in soil. Classic ‘humification’ relies on
the synthesis of large molecules from decomposition products. Selective
preservation assumes that some organic materials are preferentially
mineralized, leaving intrinsically ‘stable’ decomposition products behind.
Progressive decomposition reflects the concept of microbial processing
of large plant biopolymers to smaller molecules. In the proposed SCM,
a continuum of organic fragments is continuously processed by the
decomposer community from large plant and animal residues towards
smaller molecular size. At the same time, greater oxidation of the organic
materials increases solubility in water as well as the opportunity for
protection against further decomposition through greater reactivity
towards mineral surfaces and incorporation into aggregates. Dashed
arrow lines denote mainly abiotic transfer, solid lines denote mainly biotic
transfer; thicker lines indicate more rapid rates; larger boxes and ends
of wedges illustrate greater pool sizes; all differences are illustrative.
All arrows represent processes that are a function of temperature, moisture
and the biota present.
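
The pool-based structure that the text criticizes can be sketched in a few lines. What follows is a generic first-order multi-pool decay model, not the Century or RothC implementation. The pool names, initial stocks and turnover times are hypothetical values chosen only to show that each pool decays at a fixed default rate, with no representation of adsorption, desorption or microbial activity.

```python
import math

# Hypothetical pools: name -> (initial carbon stock, turnover time in years).
# These numbers are illustrative assumptions, not parameters of any real model.
POOLS = {
    "fast": (10.0, 2.0),
    "slow": (40.0, 50.0),
    "passive": (50.0, 1000.0),
}


def remaining_carbon(years: float, pools: dict = POOLS) -> dict:
    """Carbon left in each pool after `years`, assuming no new inputs and
    first-order decay: C(t) = C0 * exp(-t / turnover_time)."""
    return {
        name: c0 * math.exp(-years / tau)
        for name, (c0, tau) in pools.items()
    }


if __name__ == "__main__":
    for t in (0.0, 10.0, 100.0):
        stocks = remaining_carbon(t)
        total = sum(stocks.values())
        print(f"t = {t:6.1f} yr, total = {total:7.2f}, pools = {stocks}")
```

In such a scheme the persistence of the ‘passive’ pool is fixed by its assigned turnover time, which is the assumption the SCM replaces with accessibility and mineral protection.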
Figure 3 | Weighing up the empirical information supporting either the
historic or evidence-based interpretation of the nature of soil organic
matter. A consolidated assessment of scientific evidence published over
the past two decades provides explanations for the properties of alkaline
extracts that do not require invoking the secondary synthesis of ‘humic
substances’.


The shortcomings become apparent when these models are applied 
to predict the global warming feedback of soil organic carbon miner- 
alization. Rising temperatures increase microbial activity and a warm- 
ing atmosphere may therefore lead to greater mineralization of soil 
organic carbon. The resultant carbon dioxide emissions would then
accelerate the greenhouse effect and thereby increase global temperature.
Soil organic matter pools with slower turnover are thought to
respond more sensitively to climate warming than those with fast turnover.
The underlying, so-called carbon-quality-temperature theory
(CQT theory) combines classical ‘humification’ theory, that is, the
assumption that decomposition creates complex, recalcitrant compounds,
with the Arrhenius theory that chemical reactions are faster at higher temperatures.
According to CQT theory, the decomposition of a complex
substrate requires more enzymatic reactions and a higher total activation 
energy than a reaction metabolizing a simple carbon substrate, and as a 
result, would be more sensitive to rising temperatures than the decom- 
position of a simple carbon substrate. The CQT theory loses much of 
its explanatory potential for the carbon pools with slow turnover if the 
decomposition of organic matter is not creating complex and recalcitrant 
compounds. 
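
The Arrhenius argument behind CQT theory can be made concrete. With k = A exp(-Ea / (R T)), the ratio of rates at two temperatures depends only on the activation energy Ea, so a reaction with a higher Ea speeds up more for the same warming. The sketch below assumes two illustrative activation energies (50 and 80 kJ per mol) and a 10 K warming from 283.15 K. None of these numbers comes from the source, and the function name is hypothetical.

```python
import math

R = 8.314  # universal gas constant, J / (mol K)


def rate_ratio(ea_j_per_mol: float, t_cold_k: float, t_warm_k: float) -> float:
    """Ratio k(t_warm) / k(t_cold) for an Arrhenius rate constant.
    The pre-exponential factor A cancels out of the ratio."""
    return math.exp(ea_j_per_mol / R * (1.0 / t_cold_k - 1.0 / t_warm_k))


if __name__ == "__main__":
    t_cold, t_warm = 283.15, 293.15  # assumed 10 K warming
    for label, ea in (("simple substrate", 50e3), ("complex substrate", 80e3)):
        # Activation energies are assumed values for illustration only.
        print(f"{label}: Ea = {ea / 1e3:.0f} kJ/mol, "
              f"rate ratio = {rate_ratio(ea, t_cold, t_warm):.2f}")
```

The calculation shows only the kinetic half of the CQT argument. The theory's conclusion for slow-cycling pools also requires that those pools actually consist of high-activation-energy compounds, which is the premise the text disputes.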

Different organic compounds entering the soil have highly varying 
composition and in isolation (for example, fresh litter) have differ- 
ent turnover and hence temperature responses as a function of their 
composition®. However, this variation is so heavily influenced by 
environmental and biotic factors after they enter the soil ecosystem that 
the concept of relying on quality-dependent temperature responses is, in 
our opinion, obsolete. We propose that future research should concentrate 
to a much greater extent on the causes of any observed substrate prefer- 
ences, such as the absence of a decomposer with a matching catabolic 
toolbox or the lack of a critical resource for the decomposer. 

To equip models with more appropriate temperature responses, new 
approaches need to recognize first the continuum of organic compounds 
(rather than discrete pools with different turnover times), and second
the protection of organic compounds (rather than substrate quality). It 
is not obvious that merely distinguishing between the mineralization 
of plant litter on the one hand and degradation products interacting 
with the mineral matrix® on the other will generate better predictive 
capabilities, simply because they form a continuum. In addition, the 
full suite of controls on mineralization must be considered, notably 
temperature-moisture interactions®. Mechanistic understanding in 
this field will be greatly improved if ‘humification’-derived assumptions 
about the molecular structure of the slower-cycling soil carbon pools are 
replaced by considerations of the processes that render organic decom- 
position fragments mobile in soil solution. The relevance of binding 
mechanisms of organic substances to different mineral surfaces is still 
uncertain” and the stability of minerals themselves may change as a result 
of exposure to organic compounds, such as those released by roots. 
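
The shift from substrate quality to protection can be pictured with a deliberately simple toy model. It is an assumption-laden sketch, not a model proposed in the source: carbon is either accessible in soil solution or protected on mineral surfaces, microbial decay acts only on the accessible fraction, and all rate constants are hypothetical. The decay constant is the same in every run, so any difference in persistence comes from adsorption and desorption alone.

```python
def simulate(k_dec: float, k_ads: float, k_des: float,
             years: float = 100.0, dt: float = 0.01,
             accessible: float = 1.0, protected: float = 0.0) -> float:
    """Explicit Euler integration of a two-compartment toy model.

    d(accessible)/dt = -k_dec*accessible - k_ads*accessible + k_des*protected
    d(protected)/dt  =  k_ads*accessible - k_des*protected

    Returns the total carbon (accessible + protected) left after `years`.
    All rates are per year and purely illustrative.
    """
    steps = int(round(years / dt))
    for _ in range(steps):
        decay = k_dec * accessible * dt
        adsorb = k_ads * accessible * dt
        desorb = k_des * protected * dt
        accessible += desorb - adsorb - decay
        protected += adsorb - desorb
    return accessible + protected


if __name__ == "__main__":
    # Same decomposability (k_dec) in both runs; only mineral protection differs.
    print("no adsorption:  ", simulate(k_dec=0.5, k_ads=0.0, k_des=0.05))
    print("with adsorption:", simulate(k_dec=0.5, k_ads=1.0, k_des=0.05))
```

In this toy setting, identical material persists far longer when it can adsorb to minerals, without any assumption about its molecular structure.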

The laudable efforts to include microbial activity and diversity”’ into 
soil carbon models to improve climate predictions continue to focus on the 
quality of organic matter. The development of models built on microbial 
ecology should omit any emphasis on substrate quality and especially 
the proposed large ‘humified’ organic compounds. Observations in soils 
depleted of plant litter input showed microbial communities adapted to 
metabolizing simple, small compounds rather than the large and poly- 
meric organic compounds expected for old and persistent soil organic 
carbon”'. To predict the responses of soil organic carbon to climate warm- 
ing, models must move beyond conceptual pools having different turn- 
over times and instead combine soil physical principles into soil biological 
processes. As recently demonstrated”, aspects of this combination are 
already possible when models include the extent to which the mobility 
of organic fragments in soil water affects accessibility of decomposition 
products by functionally different groups of microorganisms. 

It will next be critical to develop models that provide deeper insight 
into microbial access to soil organic carbon by including the spatial 
architecture of the soil®’. Such model development benefits from spatial 
data, which are becoming available using imaging analyses in two?” or 
three dimensions. In a fully developed model, this will require extensive
computing capabilities and may only be possible if this research is priorit- 
ized or at a time when further computational advances make complex 
spatial calculations easily accessible and inexpensive. Combining these 
approaches within the SCM would provide opportunities to test whether
the distance of microorganisms from the organic matter plays as impor- 
tant a part as does the attachment of organic matter to protective mineral 
surfaces, which constitutes the next frontier in better understanding and 
prediction of soil organic carbon dynamics. 


Aquatic systems 

Because soil organic matter is a major source of organic carbon in rivers, 
lakes and estuaries’, its persistence and retention is of great interest for 
closing global carbon budgets’. Large proportions of organic carbon in 
rivers are mineralized and emitted as carbon dioxide”* or retained in 
fluvial”> and oceanic sediments”®. To date, ‘humic substances’ as extracted 
by alkali constitute the organic workhorse that is investigated by the com- 
munity of aquatic chemists. Continuing this practice of investigating 
organic matter in aquatic systems with the help of an inadequate proxy 
will not only prevent us from obtaining a better understanding of how far 
organic matter is transported and when it outgasses into the atmosphere, 
it will also generate misleading conclusions about its stability and reactivity.
As outlined above for the soil environment, we argue that the persistence
and movement of terrestrially derived organic carbon compounds
entering aquatic ecosystems will rely on their protection by minerals, 
solubility in water and microbial degradation rather than primarily their 
chemical properties. 

Aquatic carbon is not only important as part of the global carbon cycle, 
but also for local biogeochemical processes in streams and lakes. The 
observation of electron shuttling by ‘humic substances’ may serve as an 
example. Electron shuttling is often attributed to quinones and is a
key driver for the microbial use of organic carbon, including organic pol- 
lutants and oxidation of reduced metals in oxygen-limited environments 
such as aquatic sediments and peatlands*!. Extracts of ‘humic substances’ 
typically used for investigations of electron shuttling phenomena may 
have developed this capacity not as a result of ‘humification’, but because
alkaline solutions extract quinones that are present in soil as a result of
known microbial metabolism or in carbon thermally altered by fire,
which has been shown to be electrochemically active. Abandoning the
‘humic’ proxy will broaden future research to include electron transfer
mediated by organic matter that is not soluble in alkali. This will improve 
identification of mechanisms controlling methane production in tem- 
porarily anoxic environments” and those elements of biotic®® and abi- 
otic® iron cycles that remain elusive. 

Water treatment is a vital technology, but its mechanistic basis is ren- 
dered questionable by the pervasive use of the ‘humic substances’ proxy. 
Anaerobic bioremediation refers to ‘humic substances’ as an electron 
acceptor’® that removes pollutants. During purification of drinking 
water, on the other hand, ‘humic acids’ are considered contaminants, 
because reactions with disinfectants generate by-products that are toxic to 
humans*”. Research specifically targets ‘humic’ isolates that are perceived 
to be relevant proxies for organic compounds in waste water®®. Instead, 
water treatment would benefit from using organic materials that are based 
on mixtures of actually existing degradation products rather than the 
proxies based on alkaline extraction, as in the removal of organic mat- 
ter by coagulation®”. Water treatment needs to become more predictable 
because future contamination will inevitably include new pharmaceuti- 
cals or nanoparticles of which we have limited experience. 


Agriculture 

Productive soils are central to human welfare because agriculture gener- 
ates most of our food, feed and fibre. Organic matter contributes to soil 
fertility by retaining plant-available water and nutrients or promoting 
the formation of soil structure, but it is also consumed in the process of 
arable soil management as it releases needed nutrients and energy when 
it decomposes*®. However, proposals to return the carbon lost through 
agricultural activities in previous decades often emphasize the need to 
build or augment a ‘stable humus’ pool, drawing on the outdated con- 
cept of ‘humification’. Such a pool has been suggested to increase soil 
organic matter resistance to decomposition through in situ synthesis of 
macromolecules” or hydrophobic protection by ‘humic substances’. 


However, this goal seems counterproductive given that soil organic mat- 
ter is most beneficial when it decays and releases energy and nutrients”. 
Acknowledging the dynamic continuum of decomposition products 
suggests that the management of soil organic matter turnover is more 
important than the accrual of non-productive organic matter deposits. 
This requires a mechanistic understanding of interactions with minerals, 
movement into areas of lower mineralization and mediation of microbial 
activity’. The need to manage the turnover and volume of organic com- 
pounds and nutrient provisioning to optimize soil productivity (Fig. 1) 
warrants further research into balancing both stocks and flows of organic 
matter. 

Soil organic matter can reduce contaminant uptake into crops and 
leaching into groundwater through adsorption at the cost of long-term 
accumulation. Studying the hypothetical interactions of heavy metals or 
other pollutants with extracts of ‘humic substances’ will provide limited 
insight into contaminant behaviour. Future research into interactions of 
organic matter with arsenic”, other heavy metals®? or pharmaceuticals” 
will generate more robust information by investigating the entire soil 
organic matter or the portion present in soil solution rather than what 
is extractable by alkali. This will allow better predictions of contaminant 
movement and mitigation of their environmental impact by adsorption 
and microbial use. 

Alkaline extraction targets materials with abundant functional groups. 
Consequently, plant growth is often enhanced when such materials are 
added to soils, particularly to stimulate rooting’. Alkali-extracted prod- 
ucts are therefore becoming increasingly popular as soil amendments”. 
Better crop nutrition is an important part of this strategy and plant uptake 
of micronutrients is indeed known to be improved when organic com- 
pounds make them more soluble”. Positive plant responses to ‘humic 
substances’ resembling those of beneficial plant hormones”, through 
improved defence mechanisms against pests or diseases”® and changes 
in gene expression” may mean that the alkaline extracts contain com- 
pounds that trigger these effects. If we acknowledge soil organic matter as 
a continuum of decomposition products, we will be better able to design 
soil applications for specific purposes such as improved plant defence, and 
unpack what is essentially a ‘black box’ of compounds extracted by alkali. 
Research and product development should therefore focus on organic 
compounds that are soluble in water for managing soil health and focus 
on relationships between specific functional groups or compounds and 
positive plant responses for which information already exists. 


The way forward 
The need for the soil sciences to move away from both the ‘humification’ 
model and associated ‘humic’ language has been much debated. 
Unfortunately, this objective has not been implemented with rigour and 
has largely been ignored in the neighbouring fields of aquatic and envi- 
ronmental sciences. In many cases, the ‘humification’ model itself has 
been abandoned, but the ‘humic’ nomenclature is maintained. For exam- 
ple, the large molecular size of ‘humic substances’ has been refuted!3!00 
but not their existence. The issue has also been approached by redefining 
‘humic substances’ as the portion of soil organic matter that cannot be 
molecularly characterized*”!0''™, or by calling all soil organic matter 
‘humus’!!. We argue that this compromise—maintaining terminology 
but altering its meanings in varying ways—hampers scientific progress 
beyond the soil sciences. The SCM of soil organic matter does not allow 
a confusing middle path; it requires leaving the traditional view behind to 
bring about lasting innovation and progress’. This is critical as scientific 
fields outside the soil sciences base their research on the false premise of 
the existence of ‘humic substances’. Thus an issue of terminology becomes 
a problem of false inference, with far-reaching implications beyond 
our ability to communicate scientifically accurate soil processes and 
properties. 

Reconciliation of modern experimental evidence with a robust 
molecular model can immediately be achieved by consistently referring 
to ‘humic substances’ as alkaline extracts rather than suggesting that 
a distinct category of organic materials exists. This is essential when 
modelling global soil carbon, for which we need to cease using soil 
carbon pools whose definitions are rooted in ‘humic’ theory. In future 
research, alkaline extracts should not be used as proxies for naturally 
occurring organic matter or a subset thereof. Alkaline extraction should 
be supplanted by approaches that capture actual solubility in soil, river 
or ocean water. 

The SCM will direct fundamental research questions towards 
microbial access to ‘protected’ rather than ‘stable’ carbon, and this will 
lead to more mechanistic representations of pollutant mobility and elec- 
tron transfer reactions. In applied science and industry, this shift will 
prove more difficult to establish, because commercial ‘humification’ 
products and their marketing are strongly established, particularly in 
the gardening and compost industry. However, alkaline extraction does 
indeed isolate organic materials rich in oxygen, which may have value for 
product development. Therefore, we urgently need a biologically based 
explanation of the established growth-promoting effects of some highly 
oxidized organic compounds in soil in order to develop commercial prod- 
ucts that operate in a predictable manner based on observable reactions 
of enzymes, hormones or cell wall transport. This will redirect existing 
research and development programmes at the intersection of molecular 
biology, ecology and soil biogeochemistry to allow the implementation 
of scientifically sound ‘soil health’ concepts. 

Government-funded research programmes must therefore preferen- 
tially support science that bridges the gap between detailed and fine- 
scale mechanistic research at the plant-soil interface and field-scale 
research relevant to those who manage soils for their multiple ecosystem 
services. There are great opportunities for progress in explaining soil 
carbon responses to warming, and in the improvement of soil fertility 
and water quality. Coordinated interdisciplinary research programmes 
should be urgently set up to encourage greater coordination between soil 
biogeochemists and modellers. Such programmes should use the SCM to 
examine the balance between managing carbon and nutrient flows with 
sequestration, and between carbon transport, deposition and evasion in 
rivers and oceans. Models based on pools should be replaced with models 
based on organic matter solubility and spatial architecture to improve 
climate prediction, regional and global assessments of soil resources and 
soil vulnerability. The reward will be more robust forecasts and resource 
evaluation, issues critical for developing future climate change and land 
use policies. 


Soil biodiversity and human health 


Diana H. Wall, Uffe N. Nielsen & Johan Six 


Soil biodiversity is increasingly recognized as providing benefits to human health because it can suppress disease-causing 
soil organisms and provide clean air, water and food. Poor land-management practices and environmental change are, 
however, affecting belowground communities globally, and the resulting declines in soil biodiversity reduce and impair 
these benefits. Importantly, current research indicates that soil biodiversity can be maintained and partially restored 
if managed sustainably. Promoting the ecological complexity and robustness of soil biodiversity through improved 
management practices represents an underutilized resource with the ability to improve human health. 


Soils comprise a dynamic reservoir of biodiversity within which the 
interactions between microbes, animals and plants provide many 
benefits for human well-being; however, their potential use for the 
maintenance of human health has been less clear’. Living soils are vital 
to humans because soil biodiversity, with its inherent complexity (the 
types, sizes, traits and functions of soil organisms), not only provides 
disease control but also influences the quantity and quality of the food 
we eat, the air we breathe and the water we drink’. The long-term pro- 
vision of these benefits is dependent on careful and sustainable use of 
soils as a resource. Yet, soil biodiversity is often unintentionally affected 
by human-induced global changes. Land-use change, including urban- 
ization, agriculture, deforestation and desertification, can have a ripple 
effect on soils and soil biodiversity that extends far beyond the original site 
of disturbance®®. For example, the increase in soil erosion by water and 
wind contributes to the formation of dust storms and the dispersal of soil 
organisms and pathogens, with effects on soil biodiversity and ultimately 
on human, plant and animal health”~!°. 

Research efforts are rapidly producing information about soil 
biodiversity and its functions, which can be combined with land 
managers’ knowledge, to inform the development of sustainable soil- 
management practices!!!-!°. The resulting global preservation and 
restoration of soils would provide an additional path towards decreasing 
disease in and providing medicine for humans, plants and animals. 

Here, we focus on the impacts of the use and mismanagement of 
land on human health due to (1) changes in the prevalence of antago- 
nists for soil-borne pests and pathogens that cause diseases in humans, 
plants and animals, and (2) changes in soil biodiversity that affect the 
maintenance of health (Fig. 1). We use the integrated concept of human 


[Figure 1 panel labels: Poor land management (deforestation, degradation, desertification, urbanization, pollution); Climate change (precipitation, temperature and extreme events); Soil biodiversity (species richness, functional diversity, foodweb structure, biotic interactions); Loss of ecosystem functioning and service provision (water infiltration, regulation of pests and pathogens, erosion control, nutrient release); Human health impacts (increased risk of diseases caused by human pests and pathogens, by less nutritious food, and by lack of clean water and air).] 

Figure 1 | Flow diagram illustrating the link between soil biodiversity 
and human health. Soil biodiversity is often negatively affected by the 
interaction between poor land management practices and drivers of 
climate change, both of which ultimately compromise ecosystem function 
and services that are essential for human health (control of pests and 
pathogens, production of nutritious food, cleansing water and reducing 
air pollution). Responses to reduced human health can in turn affect 
management decisions that govern land use and climate change. 


Table 1 | Soil pathogens and parasites of humans, animals and plants 

Columns: Host organism; Type of pathogen; Euedaphic pathogens (Species name, Disease caused); Soil-transmitted pathogens (Species name, Disease caused) 

Humans Bacteria Bacillus anthracis Anthrax Escherichia coli Diarrhoea 
Listeria monocytogenes Listeriosis Salmonella spp. Salmonellosis, Typhoid fever, Diarrhoea 
Fungi Aspergillus spp. Aspergillosis 
Coccidioides immitis Valley fever 
Histoplasma capsulatum Histoplasmosis 
Protozoa Naegleria fowleri Brain encephalitis Toxoplasma gondii Toxoplasmosis 
Helminths (Nematoda) Ascaris lumbricoides Ascariasis 
Ancylostoma duodenale Hookworm 
Necator americanus Hookworm 
Strongyloides stercoralis Strongyloidiasis 
Platyhelminthes Taenia saginata Beef tapeworm 
Animals Bacteria Bacillus anthracis Anthrax 
Helminths (Nematoda) Haemonchus contortus Haemonchosis 
Plants Bacteria Agrobacterium tumefaciens Crown gall 
Fungi Phytophthora infestans Potato blight 
Helminths (Nematoda) Meloidogyne spp. Root knot 
Bursaphelenchus xylophilus Pine wood 


Following refs 19 and 89, pathogens are listed as euedaphic (true soil organisms) or as soil-transmitted (those temporarily living in soil and transmitted to a host). 


health as defined by the World Health Organization and the Convention 
on Biological Diversity'*, which extends beyond disease and infirmity 
and recognizes human connections to other species, ecosystems and the 
ecological foundation of varied drivers and protectors of human health. 
We specifically discuss how knowledge of the linkages between soil 
biodiversity and human health can be strengthened for improved 
management of land. Some aspects of land-use change in relation to soil 
biodiversity and human health are covered elsewhere, including indus- 
trial pollution, radioactivity, landfills, resource extraction, and mineral 
toxicity, and are not included here!>". 


Soil biodiversity and soil-borne pathogens 

Most soil organisms pose no risk to human health; rather, evidence is 
accumulating that soil biodiversity can be of great benefit!”"®. Soil-borne 
pathogens and parasites that cause human diseases represent a minority 
of the species living in soils. There is a great opportunity to capitalize 
on the positive effects of soil organisms on human health through their 
roles (direct and indirect) in controlling soil-borne pathogens and pests 
(listed in Table 1). 

Many animal, plant and human disease-causing organisms or their 
vectors live in soil, but their relationship to human diseases and the envi- 
ronment is not fully elucidated'*!?-*!. To address soil management and 
public health we need an understanding of the organisms, their ecological 
interactions, and why they become prevalent or persistent in soils’””*. 
Some soil-borne pathogens, such as the bacterial genera Pseudomonas 
and Enterobacter, are opportunistic species that can infect and cause dis- 
eases in humans but whose main functions in the soil foodweb are as 
antagonists against plant root pathogens, promoters of plant growth and 
decomposers””*. Other soil-borne pathogens are obligate parasites that 
require a host to complete their life cycle. Most of these organisms can 
survive in soils for weeks to years, including as spores and eggs or inside 
carcasses. Soil-borne pathogens causing human infectious disease can 
either be true inhabitants of soils (euedaphic) or be transmitted via soils 
(Table 1). Soil-transmitted pathogens are usually obligate pathogens and 
reside temporarily in soil before being transmitted to humans by contact, 
vectors or in faeces. 


Soil and anthrax 

Anthrax is a zoonotic disease infecting humans, wildlife and livestock 
caused by the bacterium Bacillus anthracis. Known in the USA as an 
agent of bioterrorism, B. anthracis is relatively common and found in 
soils worldwide, including within the USA. Anthrax spores can remain 
dormant in soils for decades, but with heavy rains they are brought to the 
soil surface and attach to roots and grasses, which are grazed by animals. 
There have been outbreaks in eastern Colorado and Texas, occasionally 
resulting in die-offs of grazing animals, usually cattle. In contrast to this 
episodic occurrence, B. anthracis in Namibia and east Africa occurs 
annually in zebras and other grazing animals. In recent field experiments 
in Namibia, carcasses of animals infected with B. anthracis were shown 
to promote grass growth, which thus made the site more attractive for 
grazing wildlife’. This mechanism provides B. anthracis with a wildlife 
host and continues the cycle of the infectious disease. 

Hence, one of the more effective strategies to reduce anthrax preva- 
lence is burning of the vegetation, especially at sites of carcass deposition. 
However, maintaining soil cover is also important to reduce dust forma- 
tion by wind erosion because human infections of anthrax typically result 
from inhalation of airborne spores or via vectors. 

In general, soils favourable for anthrax are calcium-rich with neutral 
to alkaline pH (that is, Chernozem soils)”°. Studies on soils and anthrax 
disease ecology in the Kruger National Park in South Africa found that 
when soil calcium was >150 milliequivalents per gram and pH >7, the 
anthrax death rate for ungulates was seven times higher than in other 
nearby soils”®. 


Soil and helminths 

The nematode Strongyloides, a soil-transmitted helminth and a parasite 
of humans and animals (Table 1), has a unique life cycle that alternates 
between free-living in soil and parasitic. The larvae are passed into soil 
in faeces and moult either (1) to become larvae that can infect humans or 
(2) to develop into adults that produce eggs and become a new free-living 
generation in soil. The free-living form feeds on bacteria as part of the 
soil foodweb, but its role in decomposition and nutrient cycling is not 
well understood. When infective larvae in soil come into contact with a 
suitable host, they penetrate the skin and eventually migrate to the intes- 
tine, where they reproduce. Strongyloides stercoralis infections occur in 
10% to 40% of the human population in many tropical and subtropical 
countries’, as a result of poor sanitation practices. In a study in rural 
Cambodia, about 45% of the people tested were infected, and a higher 
risk of infection was associated with lower organic carbon content of soils 
and land-use conversion from forest to cropland”. 

This strongly suggests that increasing the soil organic carbon levels 
in our croplands could be effective in reducing the prevalence of 
disease-causing helminths. Also included in the category of soil- 
transmitted helminths are hookworms and roundworms (Table 1), 
which infect many people globally. For example, in 2003, China and 
sub-Saharan Africa each had an estimated 200 million hookworm infec- 
tions”. The contributions of hookworms and roundworms to the soil food- 
web and their relationship to soil properties, however, are not well known. 

To understand and predict the incidence of soil-borne pathogens and 
parasites in the future, much can be gained by integrative studies on their 


BOX 1 
Human influence on soil foodwebs 

Soil foodwebs are naturally complex entities that enhance 
ecosystem functions such as biogeochemical cycling and the 
suppression of pests and pathogens for plants, animals and humans. 
Global changes, however, can have detrimental impacts on soil 
foodweb complexity by reducing belowground biodiversity and 
affecting biotic interactions. Substantial advances have been made 
in understanding these impacts on ecosystem functioning over the 
past few decades! *®°. Furthermore, recent studies suggest that 
soil foodweb complexity is essential to maintaining high rates of 
ecosystem function*>468°. Thus, activities that cause belowground 
biodiversity losses (such as loss of taxa and trophic levels) contribute 
to a reduction in foodweb complexity and, thereby, the capacity of 
soils to perform ecosystem functions. See Box 1 Figure, in which 
taxa are shown as coloured circles, trophic levels are shown as 
boxes, solid lines represent food sources and dashed lines indicate 
omnivory. The ‘simple’ foodweb on the right has been adversely 
affected by human-induced changes. Although these functions 
may not be lost completely, even reduced levels of functioning 
can influence human health directly (by reduced suppression of 
soil-borne diseases) or indirectly (by reduced provision of food, 
clean water and air). In some cases, these functions can be replaced 
through human interventions such as increased fertilizer and 
pesticide inputs, but promoting ecosystem functioning by managing 
soil biodiversity is likely to be more cost-effective and will ensure 
long-term sustainability. 

Box 1 Figure | Key features in the reduction of species in soil food webs due to human influence and the adverse effect on ecosystem functioning. 
[Box 1 Figure panel labels: ‘Complex’ foodweb; ‘Simple’ foodweb; Predators; Consumers; Carbon source; Global changes; Agricultural intensification; Lost or impaired functioning.] 


life cycles in soils, their role in soil foodwebs, and how they are affected 
by environmental variables. The resulting knowledge can be used to 
establish viable management options to reduce the impacts of soil-borne 
pathogens and parasites. Integrating this new soil-based knowledge with 
the experience of public health researchers would provide an enormous 
opportunity for new soil-based approaches and policies to control current 
and emerging infectious diseases. 


Soil and allergies 

Several studies have shown that exposure to soil microorganisms lessens 
the prevalence of allergic diseases**°°-*. In particular, there is evidence 
that our immune system needs to be exposed to possible pathogens resid- 
ing in soils in order to develop tolerance*4. For example, it was found that 
individuals living in more urban environments have a lower diversity 
of bacteria on their skin and lower immunity expression***». It is pre- 
dicted that nearly two-thirds of the global human population will be 
living in urban areas by 2050 (refs 14, 33), resulting in less stimulation 
of our immune systems by soil organisms, and leading to more allergic 
diseases. Management of urban areas could easily consider access to natural 
areas and small livestock (chickens, ducks, rabbits and goats) as a way of 
exposing the urban population to soil organisms. 


Soil, antibiotics and antihelminth resistance 

As soils are altered through global change and associated losses in bio- 
diversity above- and belowground, there is concern that we are losing 
a possible source of antibiotics and medicines, as well as the biological 


controls needed to prevent human, animal and plant disease. Antibiotic 
resistance to microbial-derived medicines has increased rapidly, threat- 
ening the prevention and treatment of diseases caused by bacteria, fungi 
and parasites'’. The development of new antibiotics using soil has been 
very slow, because about 99% of bacteria have yet to be cultured. However, 
a new technique recently identified an antibiotic from an uncultured soil 
bacterium that can kill Mycobacterium tuberculosis, the causal agent of 
tuberculosis’’. This is very promising: other as-yet-uncultured species 
may also reveal novel antibiotics. Parasitic helminth (worm) infections in 
humans, cattle and other domestic animals are often treated with anti- 
parasitic, antihelminthic drugs, to which resistance is increasing. 
Fortunately, land-management practices, including rotating pastures 
with more-resistant animals, breaking up or removing manure piles in 
pastures, or managing for higher grass growth so that animals do not 
graze on the parasites found in soils, are useful options for reducing the 
risk of parasite infection**°”. 


Soil and biological control 

Human health is influenced indirectly by our choice of agricultural 
management practices owing to changes in the nutritional value of the 
plants and animals we eat, and the quantity of food produced. Plants 
are subject to many diseases caused by bacteria, fungi, viruses and 
parasites, which affect plant growth, nutrient levels and the quality of 
our food. In agriculture, biocontrol of a soil-borne pest for plants is a 
management option that is based on the identification and ecology of 
a naturally occurring soil predator or parasite that reduces the pest or 
plant pathogen population and thereby enhances food quantity and 
nutrient content. 

Figure 2 | A conceptual framework illustrating 
how decisions on land use and management 
are linked to human health through the effect 
on soil biodiversity. Soil biodiversity is strongly 
influenced by external drivers such as climate 
change and nitrogen deposition but also by land- 
use management. Land use such as agricultural 
intensification (left) can reduce the diversity and 
densities of beneficial organisms that control pests 
and pathogens, thereby negatively affecting the 
health of plants, animals and humans. Adopting 
less-intensive management practices (right) that 
enhance soil biodiversity can promote plant, 

For example, the root weevil Diaprepes abbreviatus causes substan- 
tial damage to citrus plants. The damage is naturally controlled in some 
Florida soil habitats by species of indigenous soil entomopathogenic 
nematodes (EPN) that parasitize and kill the root weevil. A nematode 
species not native to Florida, Steinernema riobrave, is commercially avail- 
able to control the root weevil in habitats where indigenous EPNs are less 
dominant, rather than using less effective, more costly chemical controls 
that move into groundwater and are harmful to human health*®*”. In 
addition to augmenting soils with EPNs, soils are managed in ways that pro- 
mote native EPN prevalence and diversity. For example, main- 
taining good soil drainage, low pH, and adding sand to soils that are 
low in EPNs before planting young trees are practices used to conserve 
native EPN species in citrus groves. There are many other examples of 
biocontrol delivered through a healthy soil foodweb, such as the use of the 
bacterium Pasteuria penetrans, a pathogen of plant parasitic nematodes, 
and Arthrobotrys anchonia, a nematode-trapping fungus that kills plant 
parasitic nematodes*-. 

As recent evidence suggests, it is not only single organisms that should 
be considered as valuable for controlling soil-borne pests and pathogens 
of humans, animals or plants. The immense diversity and abundance 
of organisms found belowground in concert contribute to the control 
of pests and pathogens (Box 1)***°. Hence, control of soil-borne path- 
ogens should not focus solely on specific beneficial soil predators or 
parasites, but rather on how a general increase in the complexity of soil 
biodiversity can reduce plant, animal and human diseases caused by 
soil-borne pathogens: the disease is suppressed as a result of the whole 
soil foodweb*”**. 

Although the direct link between biodiversity and disease suppression 
has not been well established in soil owing to the complex interactions 
that occur belowground, there is growing evidence for the aboveground 
world: disease risk in wildlife, plants and humans rises with biodiversity 
loss*?-°?. For example, Johnson et al.°° recently showed in wetland 
ecosystems that amphibian species richness moderated pathogen transfer 
and thereby limited disease prevalence in the animals. Soil biodiver- 
sity may similarly moderate the impacts of pests and pathogens both 
above- and belowground. One recent study™ showed how increased 
microbial diversity reduced the success of a bacterial pathogen in vitro. 
Furthermore, temporary inhabitants of soil can also have positive 
functions in ecosystems and thereby indirectly benefit human health; 
for example, some bumblebees well known for their benefits as plant pol- 
linators live temporarily in soils. Burrowing vertebrates such as voles and 
prairie dogs can also indirectly benefit plant, animal and human health 
by mixing and enriching organic matter and nutrients in soils°*°. These 
examples emphasize that many soil organisms contribute indirectly to 
soil functions that can ultimately benefit human health. 


Maintaining soil biodiversity for health 

To maintain soil biodiversity, it is essential to take into account the 
spatial distribution of belowground organisms. In recent years, the 
available information on the biogeography of soil biodiversity has grown rapidly. 
Global distributions of soil taxa from microbes to larger animals show 
that few species occur in all soils; instead many species are rare and show 
restricted distributions, often limited to particular soil types or geograph- 
ical regions°”**. For example, some enchytraeid species occur primarily in 
rich Arctic peat soils, and many nematode and mite species are endemic 
to the Antarctic continent*®. Likewise, a study of the soils of Central Park 
in New York City found almost as many distinct microbial communities 
and undescribed soil biodiversity (bacteria, archaea and eukarya) as occur 
in other global biomes”. Soil biodiversity, like soils themselves, is highly 
variable across fields and regions, highlighting the need to understand 
how soil communities organized in complex soil foodwebs differ spatially 
across a region and globally?7°80*". 

In the next sections, we outline how soil biodiversity influences the 
production of food, fibre and biomass, and the provision of clean water 
and air, and illustrate how improved management of soil biodiversity can 
reverse to some degree the negative impact of humans on the depletion 
of global resources (Fig. 2). 


Food, fibre and biomass production 

With the exception of hydroponic horticulture, all terrestrial crop produc- 
tion is soil-based?". Given that crops support most of the human popula- 
tion, sustainable use of our soils is essential for long-term human health. 
In agricultural systems, soil-borne pathogens can disrupt the metabolic 
flow of nutrients within plants, reduce plant above- and belowground 
biomass, including fruits and other edible plant parts, or even kill the 
plant entirely, all leading to the production of less nutritious food. In 
humid and cooler climates, earthworms have been shown to increase 
crop productivity”, whereas termites can increase yields in warmer and 
drier climates®’. Bender and van der Heijden®™ showed that enriched soil 
life increased nutrient-use efficiency, plant nutrient uptake and thereby 
crop yields. Moreover, enhancing the soil foodweb structure influences 
the resistance and resilience of other terrestrial ecosystems", and this 
knowledge can be used to promote sustainable use of our soils**°. 

Soil symbionts have a particularly important role in sustainable pro- 
duction. Symbiotic soil microbes are essential for nutrient supply® and 
can contribute to biofortification of plants for important micronutrients 
such as zinc®. Plant breeding and micronutrient fertilization are prom- 
ising ways to address micronutrient deficiencies, but the bioavailability 
of the micronutrients is ultimately determined by soil microbial cycling 
of these micronutrients!®. There is also evidence that fungi, in particular 
endophytes, promote plant stress tolerance”’. 

It is evident from the above that soil biodiversity can play a crucial 
part in providing a more stable supply of food and a higher nutritional 
value of the food produced. However, the intensification of agricultural 
practices in the last century has ignored this role of soil biodiversity. The 
cornerstones of agricultural intensification—ploughing, and the appli- 
cation of agrochemicals and fertilizer—have been linked to a reduction 
of soil biodiversity*®. We stress that these are beneficial practices that 
should not be abolished, but instead should be used at the right time, 
rate and place. 


Air quality 

Land-use change has been tied to the frequency of dust storms, 
emissions of greenhouse gases, and the release of volatile organic com- 
pounds and biota in air’. Soil bacteria, fungi and some invertebrates, such 
as nematodes and mites, are transported several hundreds to thousands 
of kilometres by wind’!””. The misuse of land—such as overly intensive 
ploughing, leaving extended areas bare and fallow, and burning plant bio- 
mass from fields—increases dust and the formation of particulate matter 
of less than 10 µm in size (PM10), with major consequences for human 
health in the form of respiratory problems, lung tissue damage, and even 
lung cancer”. Because soils are frequently polluted with heavy metals, 
harbour antibiotic-resistant organisms from animal feedlots, and contain 
pathogens for plants, animals and humans, the resulting dust can cause 
negative effects on human health’*”4. 

An example is valley fever in the southwestern region of the USA; 
outbreaks are caused by a soil fungus, Coccidioides immitis, that 
normally decays organic matter and helps to stabilize the soil surface, thus 
minimizing soil erosion. However, when the soil is disturbed, such as by 
agricultural practices, the fungus produces windblown spores that can 
cause lung disease in animals and humans and at worst result in death”>”°. 
In 2004, there were 6,000 cases of valley fever in the USA“. Surveillance 
of dust storms using land-atmosphere modelling and remote sensing is 
under way to improve epidemiological tracking and decrease the 
number of cases of valley fever”*. 

Here again the link between agricultural intensification, soil biodiver- 
sity, and human health is clear; intensive agriculture disturbs the soil and 
negatively affects soil organisms, such as arbuscular mycorrhizae, sapro- 
trophic fungi, and earthworms, that play a key part in stabilizing soil and 
thereby reduce the potential of dust formation’”’*. Management options 
such as reduced tillage have been shown to reduce PM10 formation”* and 
thus limit the risk of lung disease, cardiac arrhythmia, heart attacks and 
premature death”. Other ways to reduce dust and conserve soil stabil- 
ity and biodiversity include agroecological management practices, such 
as planting windbreaks, adding manure, incorporating cover crops and 
retaining crop residues*”. 


Water quality 

The provision of clean drinking water is increasingly compromised by 
pollution (such as from mining, landfills and agrochemicals)*! and poor 
sanitation (contaminating drinking water with faecal-associated organisms)'°**. 
Moreover, land-use changes, especially those accompanying 
urbanization, affect the relationship between runoff versus infiltration of 
water with potential impacts on local surface water bodies, groundwater 
levels, areas downstream of point source pollution and the recharge of 
aquifers. 

BOX 2 
The UN Sustainable Development Goals 

The successful implementation of the UN Sustainable Development 
Goals”?, adopted in September 2015 and aimed at ending poverty and 
improving the lives of the poor, is tightly connected to maintaining 
the biodiversity of soils. Yet only four of the seventeen goals 
specifically mention soil: Goal 2 (to end hunger, achieve food security 
and improved nutrition, and promote sustainable agriculture; Target 2.4), 
Goal 3 (to ensure healthy lives and promote well-being for all at all 
ages; Target 3.9), Goal 12 (to ensure sustainable consumption and 
production patterns; Target 12.4), and Goal 15 (to protect, restore and 
promote sustainable use of terrestrial ecosystems, sustainably manage 
forests, combat desertification and halt and reverse land degradation, 
and halt biodiversity loss; Target 15.3). 

For example, Goal 3 (Target 3.9) focuses on substantially 
reducing hazardous chemicals and air, water and soil pollution. 
However, the connection between managing land for enhanced soil 
biodiversity and meeting Sustainable Development Goals such as 
ending epidemics of tropical and other communicable diseases 
(Goal 3, Target 3.3) or sustainable management of water and 
sanitation (Goal 6) is not recognized or incorporated. 

To achieve the Sustainable Development Goals we stress that 
it is not enough to aim towards improvement of a single benefit 
related to ‘food’ or ‘air’ or ‘water’ or ‘disease’ control, because all are 
simultaneously dependent on soils and soil biodiversity. We propose 
a multiple-benefit focus for sustaining soils, biodiversity and global 
health that addresses many of the Sustainable Development 
Goals (such as Goals 1, 2, 3, 6, 8, 11, 13, 14 and 15) through the 
following means: 

* Include soil biodiversity and human, plant and animal health 
experts in integrated collaborative research, management and 
policy efforts to sustainably manage soils, food, water and air 
for improving human health 

* Develop a global database of soil biodiversity to facilitate 
integrated and predictive use by scientists and health experts 

* Establish a global archive of samples for the future benefit of 
health researchers that captures the interactions of total soil 
biodiversity (bacteria, archaea, eukaryotes) 

* Utilize existing, new and local knowledge on successful 
management of lands to promote new options for long-term 
maintenance and conservation of soil biodiversity and 
improving human health 

* Include soil biodiversity as a criterion for determining 
wilderness and protected areas 

* Focus research on conservation of soil biodiversity as a 
management tool to improve human health in the long term 

* Coordinate scientific societies and other global efforts to 
educate and communicate results to land and water managers, 
public and policy makers, such as through the Global Soil 
Biodiversity Initiative (https://globalsoilbiodiversity.org), global 
conventions and scientific societies 

* Broaden the disciplines of human health and soil biodiversity 
linkages to include the combined expertise needed to address 
the multifaceted climate and global environmental changes 
and to meet the Sustainable Development Goals 

Soil biodiversity acts to enhance the structure of soils and thereby infil- 
tration and percolation of water through the soil profile to (1) improve 
water-use efficiency by crops, (2) limit the amount of agricultural runoff 
and associated contamination into adjacent land areas, and (3) filter out 
pathogens and contaminants by size exclusion, die-off and adsorption. 
Soil organisms can also degrade harmful pollutants and reduce the impact 
of poor sanitation®?**. For example, Enterobacter cloacae, an enteric bacterium 
found in soils and water, is an effective means of bioremediating 
selenium-contaminated agricultural drainage water®”. Selenium, an essen- 
tial micronutrient for humans, occurs in groundwater and can accumulate 
in irrigated river basins and evaporative ponds. Implementing additional 
measures such as reduced irrigation, sealing earthen irrigation canals, 
and rotational land fallowing can further enhance the management of 
excess selenium®’. 


Outlook 

It is clear that soil biodiversity represents an underutilized resource for 
sustaining or improving human health through better soil management. 
As indicated above, some agroecological management options are known 
to maintain and increase soil biodiversity for human, animal and plant 
health. However, further development of viable practices and especially 
the promotion of their use as broadly as possible is urgently needed. 

How to best manage the world’s lands for improved human health? 
Some basic guidelines for management of soil biodiversity are offered 
here. We suggest that a new approach for land use and management is 
required that acknowledges that soil biota act in concert to provide mul- 
tiple benefits, even if these benefits are not easily observed. Moreover, 
increased soil foodweb complexity promotes resistance and resilience to 
perturbation and may buffer the impacts of extreme events. 

Agroecological practices that enhance soil organic matter content 
and soil biodiversity can promote nutrient supply, water infiltration and 
well-structured soil. Effective management options for cropping systems 
include reduced tillage with residue retention and rotation, cover crop 
inclusion, integrated pest management, and integrated soil fertility man- 
agement (such as the combination of chemical and organic fertilizer). 
Expanding plant species diversity in crop and/or land rotations and add- 
ing organic amendments to pastures can increase soil biodiversity and 
better mimic the natural soil foodweb®*®. Additionally, maintenance 
of soil biodiversity at the landscape level can be enhanced through buffer 
strips and riparian zones and land rotations. Drainage water manage- 
ment can reduce the movement of pollutants, agrochemicals and other 
contaminants to nearby landscapes!’. Likewise, several forestry practices 
exist that promote soil biodiversity: re-established mixed deciduous forest 
stands in Europe were shown to have higher soil biodiversity than pure 
coniferous stands*”. 

Management for conservation of land should include soil biodiver- 
sity as an important criterion in determining protected and wilderness 
areas, particularly in rapidly changing ecosystems, such as tropical 
forests, permafrost soils and alpine grasslands. Conservation of soil 
biodiversity should, in general terms, be based on existing knowledge 
of soil properties, the abundance, sizes and types of soil organisms, 
and vegetation. Nevertheless, conserving soil biodiversity could also 
be done through laboratory isolation of individual organisms or whole 
communities to maintain a reservoir of genetic and functional diversity 
appropriate for future disease prevention, biological technologies, and 
pharmaceuticals®®. 

Soil archives that conserve live collections of interacting species of 
soil microbes and invertebrates in soil samples from different biomes are 
irreplaceable and essential; yet at present there are few such archives®®. 
Given the growing global demands placed on limited productive land 
and the projected increases in infectious diseases, there is an urgent need 
to implement these and other conservation measures as a stockpile for 
the future. 

Ideally, the practices and conservation strategies outlined above 
that enhance soil biodiversity for the maintenance of human health 
should be incorporated directly into land-, air- and water-use policies 
at global and regional levels and integrated with public health organi- 
zations such as the United Nations (UN) World Health Organization. 
Global conventions such as the UN Framework Convention on Climate 
Change, the UN Convention on Biological Diversity (CBD) and the 
UN Convention to Combat Desertification are all central to soils and 
global land use but often neglect soil biodiversity and our dependence 
on soil for human health, with the exception of the CBD!4 through the 
Food and Agricultural Organization (FAO). Through the Global Soil 
Partnership, the UN FAO brings together global institutions and other 
interested parties to coordinate agreements and international challenges 
related to soil sustainability. The Global Soil Partnership is advised on 
global soil issues by a scientific Intergovernmental Technical Panel on 
Soils. Likewise, progress towards the UN Sustainable Development 
Goals can be achieved by incorporating knowledge of soil biodiversity 
into a broader spectrum of benefits that improve human health (see 
Box 2; ref. 89). Importantly, the Global Soil Biodiversity Initiative was 
established as an independent scientific effort to provide information 
on soil biodiversity to policymakers and is preparing to publish the 
first Global Soil Biodiversity Atlas in collaboration with the European 
Union Joint Research Centre. The Global Soil Biodiversity Initiative 
(https://globalsoilbiodiversity.org) is also working to have soil bio- 
diversity considered in current international initiatives such as the 
Intergovernmental Platform on Biodiversity and Ecosystem Services and 
Future Earth. 

Fortunately, there is increased recognition that developing effective 
management tools for soil biodiversity requires active information 
transfer between scientists and policymakers with new policies formed 
on current evidence-based knowledge and local cultural knowledge*™. 
However, we need to identify implementation mechanisms to encour- 
age easier updates on best management practices and related policies to 
ensure long-term sustainable use of global lands under a changing global 
environment. This is particularly crucial given the rapid accumulation of 
new insights on how soil biodiversity can be managed to promote human 
health. 

We are losing soils and soil biodiversity at a rapid pace, with substantial 
negative ramifications on human health worldwide. It is time to recognize 
and manage soil biodiversity as an underutilized resource for achieving 
long-term sustainability goals related to global human health, not only for 
improving soils, food security, disease control, water and air quality, but 
because biodiversity in soils is connected to all life and provides a broader, 
fundamental ecological foundation for working with other disciplines to 
improve human health. 
Measuring entanglement entropy in a 
quantum many-body system 


Rajibul Islam¹, Ruichao Ma¹, Philipp M. Preiss¹, M. Eric Tai¹, Alexander Lukin¹, Matthew Rispoli¹ & Markus Greiner¹ 


Entanglement is one of the most intriguing features of quantum mechanics. It describes non-local correlations between 
quantum objects, and is at the heart of quantum information sciences. Entanglement is now being studied in diverse 
fields ranging from condensed matter to quantum gravity. However, measuring entanglement remains a challenge. 
This is especially so in systems of interacting delocalized particles, for which a direct experimental measurement of 
spatial entanglement has been elusive. Here, we measure entanglement in such a system of itinerant particles using 
quantum interference of many-body twins. Making use of our single-site-resolved control of ultracold bosonic atoms 
in optical lattices, we prepare two identical copies of a many-body state and interfere them. This enables us to directly 
measure quantum purity, Rényi entanglement entropy, and mutual information. These experiments pave the way for 
using entanglement to characterize quantum phases and dynamics of strongly correlated many-body systems. 


Entangled quantum objects¹ are correlated in ways that reject the 
principle of local realism. In few-level quantum systems, entangled 
states have been investigated extensively as a means of studying the 
foundations of quantum mechanics² and as a resource for quantum 
information applications³. Recently, it was realized that the concept of 
entanglement has broad impact in many areas of quantum many-body 
physics, ranging from condensed matter⁴ to high-energy field theory⁵ 
and quantum gravity⁶. In this general context, entanglement is most 
often quantified by the entropy of entanglement¹ that arises in a 
subsystem when the information about the remaining system is ignored. 
This entanglement entropy exhibits qualitatively different behaviour 
from that of classical entropy and has been used in theoretical physics 
to probe various properties of many-body systems. In condensed 
matter physics, for example, the scaling behaviour⁷ of entanglement 
entropy allows phases to be distinguished that cannot be characterized 
by symmetry properties, such as topological states of matter⁸⁻¹⁰ and 
spin liquids¹¹,¹². Entanglement entropy can be used to probe quantum 
criticality¹³ and non-equilibrium dynamics¹⁴,¹⁵, and to determine 
whether efficient numerical techniques for computing many-body 
physics exist¹⁶. 

Despite the growing importance of entanglement in theoretical 
physics, current condensed matter experiments do not have a direct 
probe with which to detect and measure entanglement. Synthetic 
quantum systems such as cold atoms¹⁷,¹⁸, photonic networks¹⁹, and 
some microscopic solid state devices²⁰ have unique advantages: in such 
systems control and detection of single particles are possible, they 
provide experimental access to relevant dynamical timescales, and they 
are isolated from the environment. In these systems, specific 
entangled states of few qubits, such as the highly entangled 
Greenberger-Horne-Zeilinger (GHZ) state²¹, have been experimentally created and 
detected using witness operators²². However, entanglement witnesses 
are state specific. For arbitrary states, an exhaustive method of 
reconstructing the entire quantum state by tomography²³ can be used to 
measure entanglement. This has been accomplished in small systems 
of photonic qubits²⁴ and trapped ion spins²⁵, but there is no known 
way to perform tomography for systems involving itinerant 
delocalized particles. With multiple copies of a system, however, one can use 
quantum many-body interference to quantify entanglement even in 
itinerant systems¹⁵,²⁶,²⁷. 


In this work, we take advantage of the precise control and readout 
afforded by our quantum gas microscope”* to prepare and interfere two 
identical copies of a four-site Bose-Hubbard system. This many-body 
quantum interference enables us to measure quantities that are not 
directly accessible in a single system (without tomography), for exam- 
ple, quadratic functions of the density matrix!>?°?”?-*?. Such non- 
linear functions can reveal entanglement". In our system, we directly 
measure the quantum purity, Rényi entanglement entropy, and mutual 
information to probe the entanglement in site occupation numbers. 


Bipartite entanglement 

To detect entanglement in our system, we use a fundamental property 
of entanglement between two subsystems (bipartite entanglement): 
ignoring information about one subsystem results in the other 
becoming a classical mixture of pure quantum states. This classical mixture 
in a density matrix $\rho$ can be quantified by measuring the quantum 
purity, defined as $\mathrm{Tr}(\rho^2)$. For a pure quantum state the density matrix 
is a projector and $\mathrm{Tr}(\rho^2) = 1$, whereas for a mixed state $\mathrm{Tr}(\rho^2) < 1$. 
In the case of a product state, the subsystems A and B of a many-body 
system AB described by a separable wavefunction $|\psi_{AB}\rangle$ (Fig. 1) 
are individually pure as well, that is, $\mathrm{Tr}(\rho_A^2) = \mathrm{Tr}(\rho_B^2) = \mathrm{Tr}(\rho_{AB}^2) = 1$. 
Here the reduced density matrix of A is $\rho_A = \mathrm{Tr}_B(\rho_{AB})$, where 
$\rho_{AB} = |\psi_{AB}\rangle\langle\psi_{AB}|$ is the density matrix of the full system. $\mathrm{Tr}_B$ indicates 
tracing over or ignoring all information about the subsystem B. For an 
entangled state, the subsystems become less pure compared to the full 
system as the correlations between A and B are ignored in the reduced 
density matrix, $\mathrm{Tr}(\rho_A^2) = \mathrm{Tr}(\rho_B^2) < \mathrm{Tr}(\rho_{AB}^2) = 1$. Even if the many-body 
state is mixed ($\mathrm{Tr}(\rho_{AB}^2) < 1$), it is still possible to measure entanglement 
between the subsystems'. It is sufficient** to prove this entanglement by 
showing that the subsystems are less pure than the full system, that is: 

$$\mathrm{Tr}(\rho_A^2) < \mathrm{Tr}(\rho_{AB}^2), \qquad \mathrm{Tr}(\rho_B^2) < \mathrm{Tr}(\rho_{AB}^2) \tag{1}$$


These inequalities provide a powerful tool with which to detect entan- 
glement in the presence of experimental imperfections. Furthermore, 
quantitative bounds on the entanglement present in a mixed many- 
body state can be obtained from these state purities**. 
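
The purity criterion of equation (1) can be checked numerically on small examples. The following is a minimal sketch, assuming two-level subsystems and NumPy; the helper names (`pure_state`, `reduced_density_matrix`, `purity`, `entangled_by_purity`) and the noisy Bell state are illustrative choices, not part of the experiment described here.

```python
import numpy as np

def pure_state(psi):
    """Density matrix |psi><psi| of a (normalized) state vector."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def reduced_density_matrix(rho_ab, dim_a, dim_b, keep="A"):
    """Partial trace of rho_AB over B (keep='A') or over A (keep='B')."""
    rho = rho_ab.reshape(dim_a, dim_b, dim_a, dim_b)
    if keep == "A":
        return np.einsum("ibjb->ij", rho)  # sum over the B indices
    return np.einsum("aiaj->ij", rho)      # sum over the A indices

def purity(rho):
    """Quantum purity Tr(rho^2)."""
    return float(np.real(np.trace(rho @ rho)))

def entangled_by_purity(rho_ab, dim_a, dim_b, tol=1e-9):
    """True if both subsystems are less pure than the full system (equation (1))."""
    p_ab = purity(rho_ab)
    p_a = purity(reduced_density_matrix(rho_ab, dim_a, dim_b, "A"))
    p_b = purity(reduced_density_matrix(rho_ab, dim_a, dim_b, "B"))
    return p_a < p_ab - tol and p_b < p_ab - tol

# Product state |0> (|0> + |1>)/sqrt(2): every purity equals 1.
product = pure_state(np.kron([1, 0], [1, 1]))
# Bell state (|00> + |11>)/sqrt(2): subsystem purity drops to 1/2.
bell = pure_state([1, 0, 0, 1])
# Illustrative mixed state: Bell state with 10% white noise.
noisy_bell = 0.9 * bell + 0.1 * np.eye(4) / 4

for name, rho in [("product", product), ("bell", bell), ("noisy bell", noisy_bell)]:
    print(name, purity(rho), entangled_by_purity(rho, 2, 2))
```

The noisy example illustrates the point made in the text: the full state is no longer pure, yet the subsystems are still less pure than the whole, so the inequalities continue to flag entanglement.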


1Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA. 
Figure 1 | Bipartite entanglement and partial measurements. 
A generic pure quantum many-body state has quantum correlations 
(shown as arrows) between different parts. If the system is divided into 
two subsystems A and B, the subsystems will be bipartite entangled 
with each other when there are quantum correlations between them 
(right column). Only when there is no bipartite entanglement present 
can the partitioned system $|\psi_{AB}\rangle$ be described as a product of subsystem 
states $|\psi_A\rangle$ and $|\psi_B\rangle$ (left column). A path for measuring the bipartite 
entanglement emerges from the concept of partial measurements: 
ignoring all information about subsystem B (indicated as ‘Trace’) will put 
subsystem A into a statistical mixture, to a degree given by the amount of 
bipartite entanglement present. Finding ways of measuring the many-body 
quantum state purity of the system and comparing it with that of its subsystems 
would then enable measurements of entanglement. For an entangled state, 
the subsystems will have less purity than the full system. 


Equation (1) can be framed in terms of entropic quantities)*. 
A particularly useful and well studied quantity is the nth-order Rényi 
entropy: 

$$S_n(A) = \frac{1}{1-n} \log \mathrm{Tr}(\rho_A^n) \tag{2}$$

From equation (2), we see that the second-order (n = 2) Rényi entropy 
and purity are related by $S_2(A) = -\log \mathrm{Tr}(\rho_A^2)$. $S_2(A)$ provides a lower 
bound? for the von Neumann entanglement entropy $S_{\mathrm{VN}}(A) = S_1(A) 
= -\mathrm{Tr}(\rho_A \log \rho_A)$, which has been extensively studied theoretically. The 
Rényi entropies are rapidly gaining importance in theoretical 
condensed matter physics because they can be used to extract information 
about the “entanglement spectrum”*>, thus providing more complete 
knowledge about the quantum state than just the von Neumann entropy. 
In terms of the second-order Rényi entropy, the conditions sufficient 
to demonstrate entanglement! 3 become $S_2(A) > S_2(AB)$ and 
$S_2(B) > S_2(AB)$, that is, the subsystems have more entropy than the full 
system. These entropic inequalities are more powerful in detecting 
certain entangled states than other inequalities such as the 
Clauser-Horne-Shimony-Holt (CHSH) inequality****. 
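
As a small numerical illustration of equation (2), the sketch below evaluates Rényi entropies from the eigenvalues of a density matrix and compares the second-order entropy with the von Neumann entropy. It assumes NumPy and natural logarithms; the function name `renyi_entropy` and the example spectra are hypothetical choices made for illustration only.

```python
import numpy as np

def renyi_entropy(rho, n):
    """nth-order Renyi entropy S_n = log Tr(rho^n) / (1 - n), natural log.

    For n = 1 the von Neumann entropy -Tr(rho log rho) is returned,
    which is the n -> 1 limit of the Renyi family.
    """
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues
    if np.isclose(n, 1.0):
        return float(-np.sum(evals * np.log(evals)))
    return float(np.log(np.sum(evals ** n)) / (1.0 - n))

# Illustrative reduced density matrices (already diagonal).
rho_pure = np.diag([1.0, 0.0])
rho_partial = np.diag([0.7, 0.3])
rho_max = np.diag([0.5, 0.5])

for label, rho in [("pure", rho_pure), ("partial", rho_partial), ("maximal", rho_max)]:
    s2 = renyi_entropy(rho, 2)  # equals -log(purity)
    s_vn = renyi_entropy(rho, 1)
    print(label, s2, s_vn, s2 <= s_vn + 1e-12)
```

In each case the second-order entropy does not exceed the von Neumann entropy, consistent with the lower-bound statement above; the two coincide for pure and maximally mixed spectra.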


Measurement of quantum purity 

The quantum purity and hence the second-order Rényi entropy can be 
directly measured by interfering two identical and independent copies 
of the quantum state on a 50%-50% beam splitter!>67°. For two 
identical copies of a bosonic Fock state, the output ports always have 
even particle numbers, as illustrated in Fig. 2a. This is due to the 
destructive interference of all odd outcomes. If the system is composed 
of multiple modes, such as internal spin states or various lattice sites, 
the expectation value of the total number parity $P_i = \prod_k p_i^{(k)}$ is equal to 
unity in the output ports i = 1, 2. Here the parity for mode k is $p_i^{(k)} = \pm 1$ 
for even or odd numbers of particles, respectively. 

The well known Hong-Ou-Mandel (HOM) interference of two 
identical single photons*® is a special case of this scenario. Here a pair 
of indistinguishable photons incident upon different input ports of a 
50%-50% beam splitter interfere such that both photons always exit 
from the same output port. In general, the average parity measured 
in the many-body bosonic interference on a beam splitter probes the 
quantum state overlap (Supplementary Information) between the two 
copies, $\langle P_i \rangle = \mathrm{Tr}(\rho_1 \rho_2)$, where $\rho_1$ and $\rho_2$ are the density matrices of 
the two copies respectively and $\langle \ldots \rangle$ denotes averaging over repeated 
experimental realizations, as shown in Fig. 2b. Hence, for two identical 
systems, that is, for $\rho_1 = \rho_2 = \rho$, the average parity for both output ports 
(i = 1, 2) equals the quantum purity of the many-body state¹⁵,²⁶,²⁷, 

$$\langle P_i \rangle = \mathrm{Tr}(\rho^2) \tag{3}$$

Equation (3) represents the most important theoretical foundation 
behind this work: it connects a quantity depending on quantum 
coherences in the system to a simple observable in the number of 
particles. It holds even without fixed particle number, as long as there 
is no definite phase relationship between the copies (Supplementary 
Information). From equations (1) and (3), detecting entanglement 
in an experiment is thus reduced to simply measuring the average 
particle number parity in the output ports of the multi-mode beam 
splitter. 

Figure 2 | Measurement of quantum purity with many-body bosonic 
interference of quantum twins. a, When two N-particle bosonic systems 
that are in identical pure quantum states are interfered on a 50%-50% 
beam splitter, they always produce output states with an even number 
of particles in each copy. This is due to the destructive interference of 
odd outcomes and represents a generalized HOM interference, in which 
two identical photons always appear in pairs after interfering on a beam 
splitter. b, If the input states $\rho_1$ and $\rho_2$ are not perfectly identical or not 
perfectly pure, the interference contrast is reduced. In this case the 
expectation value of the parity of particle number $\langle P_i \rangle$ in either output 
(i = 1, 2) measures the quantum state overlap between the two input states. 
For two identical input states $\rho_1 = \rho_2$, the average parity $\langle P_i \rangle$ therefore 
directly measures the quantum purity of the states. We assume only that 
the input states have no relative macroscopic phase relationship. 

We probe entanglement formation in a system of interacting $^{87}$Rb 
atoms on a one-dimensional optical lattice with a lattice constant 
of 680 nm. The dynamics of atoms in the lattice is described by the 
Bose-Hubbard Hamiltonian: 

$$H = -J \sum_{\langle i,j \rangle} a_i^\dagger a_j + \frac{U}{2} \sum_i n_i (n_i - 1) \tag{4}$$

where $a_i^\dagger$, $a_i$ and $n_i = a_i^\dagger a_i$ are the bosonic creation, annihilation, 
and the number operators at site i, respectively. The atoms tunnel 
between neighbouring lattice sites (indicated by $\langle i, j \rangle$) with a rate J and 
experience an onsite repulsive interaction energy U. Planck’s constant 
h is set to 1 and hence both J and U are expressed in hertz. The 
dimensionless parameter U/J is controlled by the depth of the optical lattice. 
Additionally, we can superimpose an arbitrary optical potential with 
the resolution of a single lattice site by using a spatial light modulator 
as an amplitude hologram through a high-resolution microscope 
(Supplementary Information). This microscope also allows us to image 
the number parity of each lattice site independently”. 
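
To make the Bose-Hubbard Hamiltonian of equation (4) concrete, the following is a minimal sketch that builds it as a dense matrix in the fixed-particle-number Fock basis. It assumes NumPy, open boundary conditions and a one-dimensional chain; the function names `fock_basis` and `bose_hubbard` and the parameter values are illustrative, not taken from the experiment.

```python
import itertools
import numpy as np

def fock_basis(n_sites, n_particles):
    """All occupation tuples (n_1, ..., n_L) with a fixed total particle number."""
    return [occ for occ in itertools.product(range(n_particles + 1), repeat=n_sites)
            if sum(occ) == n_particles]

def bose_hubbard(n_sites, n_particles, J, U):
    """Dense matrix of equation (4) on an open chain, in the Fock basis."""
    basis = fock_basis(n_sites, n_particles)
    index = {occ: k for k, occ in enumerate(basis)}
    H = np.zeros((len(basis), len(basis)))
    for col, occ in enumerate(basis):
        # On-site interaction (U/2) * sum_i n_i (n_i - 1): diagonal term.
        H[col, col] = 0.5 * U * sum(n * (n - 1) for n in occ)
        # Tunnelling -J * a_dst^dagger a_src in both directions on each bond.
        for i in range(n_sites - 1):
            for src, dst in ((i, i + 1), (i + 1, i)):
                if occ[src] == 0:
                    continue
                new = list(occ)
                amp = np.sqrt(new[src])        # a|n> = sqrt(n)|n-1>
                new[src] -= 1
                amp *= np.sqrt(new[dst] + 1)   # a^dagger|n> = sqrt(n+1)|n+1>
                new[dst] += 1
                H[index[tuple(new)], col] += -J * amp
    return basis, H

# Four sites, four atoms, deep in the strongly interacting regime (U/J = 100).
basis, H = bose_hubbard(4, 4, J=1.0, U=100.0)
evals, evecs = np.linalg.eigh(H)
ground = evecs[:, 0]
mott_weight = ground[basis.index((1, 1, 1, 1))] ** 2
print(len(basis), mott_weight)
```

For large U/J the ground state is dominated by the unit-filling Fock state, in line with the atomic-limit picture; lowering U/J in this sketch spreads the weight over other occupation configurations.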


© 2015 Macmillan Publishers Limited. All rights reserved 


To initialize two independent and identical copies of a state with 
fixed particle number N, we start with a low-entropy two-dimensional 
Mott insulator with unity filling in the atomic limit** and 
deterministically retain a plaquette of 2 × N atoms while removing all others 
(Supplementary Information). This is illustrated in Fig. 3a. The 
plaquette of 2 × N atoms contains two copies (along the y direction) 
of an N-atom one-dimensional system (along the x direction), with 
N = 4 in this figure. The desired quantum state is prepared by 
manipulating the depth of the optical lattice along x, varying the parameter 
$U/J_x$, where $J_x$ is the tunnelling rate along x. A box potential created by 
the spatial light modulator is superimposed onto this optical lattice to 
constrain the dynamics to the sites within each copy. During the state 
preparation, a deep lattice barrier separates the two copies and makes 
them independent of each other. 

The beam splitter operation required for the many-body interference 
is realized in a double-well potential along y. The dynamics of atoms 
in the double well is likewise described by the Bose-Hubbard 
Hamiltonian, equation (4). A single atom, initially localized in one well, 
coherently oscillates between the wells with a Rabi frequency of $J = J_y$ 
(oscillation frequency in the amplitude). At discrete times during this 
evolution, $t = t_{\mathrm{BS}}^{(n)} = \frac{2n-1}{8 J_y}$ with n = 1, 2, ..., the atom is delocalized 
equally over the two wells with a fixed phase relationship. Each of these 
times realizes a beam splitter operation, for which the same two wells 
serve as the input ports at time t = 0 and output ports at time $t = t_{\mathrm{BS}}^{(n)}$. 
Two indistinguishable atoms with negligible interaction strength 
($U/J_y \ll 1$) in this double well will interfere as they tunnel. The 
dynamics of two atoms in the double well is demonstrated in Fig. 3b in terms 
of the joint probability P(1, 1) of finding them in separate wells versus 
the normalized time $J_y t$. The joint probability P(1, 1) oscillates at a 
frequency of 772(16) Hz = $4J_y$, with a contrast of 95(3)%. At 
the beam splitter times, $t = t_{\mathrm{BS}}^{(n)}$, $P(1, 1) \approx 0$. The first beam splitter 
time, $t_{\mathrm{BS}} = t_{\mathrm{BS}}^{(1)} = \frac{1}{8 J_y}$, is used for all the following experiments, with 
P(1, 1) = 0.05(2). This is a signature of bosonic interference of two 
indistinguishable particles*”’, akin to the photonic HOM 
interference®°. This high interference contrast indicates the near-perfect 
suppression of classical noise and fluctuations and includes an expected 
0.6% reduction due to finite interaction strength ($U/J_y \approx 0.3$). The 
results from this interference can be interpreted as a measurement of 
the quantum purity of the initial Fock state as measured from the 
average parity (equation (3)), $\langle P_i \rangle = 1 - 2 \times P(1, 1) = 0.90(4)$, where i = 1, 2 
are the two copies. 
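
The two-atom interference described above can be reproduced with a three-state model of the double well. The sketch below is an idealized illustration, assuming NumPy, exactly two bosons, no noise, and h = 1 so that energies are in hertz; the function name `p11` is hypothetical, and the value of $J_y$ is the 193 Hz quoted for the experiment.

```python
import numpy as np

def p11(t, J, U=0.0):
    """Probability of finding one atom in each well at time t.

    Two bosons in a double well, basis |2,0>, |1,1>, |0,2>, starting in |1,1>.
    Energies J and U are in Hz (h = 1), so the phase factor is exp(-2*pi*i*E*t).
    """
    s = np.sqrt(2.0) * J  # bosonic enhancement of the tunnelling matrix element
    H = np.array([[U, -s, 0.0],
                  [-s, 0.0, -s],
                  [0.0, -s, U]])
    evals, evecs = np.linalg.eigh(H)
    psi0 = np.array([0.0, 1.0, 0.0])
    psi_t = evecs @ (np.exp(-2j * np.pi * evals * t) * (evecs.T @ psi0))
    return float(abs(psi_t[1]) ** 2)

J_y = 193.0                # tunnelling rate in Hz
t_bs = 1.0 / (8.0 * J_y)   # first beam splitter time

print(p11(0.0, J_y))                 # both atoms start in separate wells
print(p11(t_bs, J_y))                # ideal interference: coincidences vanish
print(p11(t_bs, J_y, U=0.3 * J_y))   # residual coincidences from interactions
print(1.0 - 2.0 * 0.05)              # purity estimate from the measured P(1, 1)
```

Without interactions P(1, 1) follows $\cos^2(4\pi J_y t)$, which oscillates at $4J_y$ and vanishes at $t = 1/(8J_y)$; with $U/J_y = 0.3$ the model leaves a residual coincidence probability of roughly 0.6%, matching the expected reduction quoted in the text.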


Entanglement in the ground state 
The Bose-Hubbard model provides an interesting system in which to 
investigate entanglement. In optical lattice systems, a lower bound of 
the spatial entanglement has been previously estimated from time-of- 
flight measurements*’ and entanglement dynamics in spin degrees of 
freedom has been investigated with partial state reconstruction”. Here, 
we directly measure entanglement in real space occupational particle 
number in a site-resolved way. In the strongly interacting atomic limit 
of $U/J_x \gg 1$, the ground state is a Mott insulator corresponding to a Fock 
state of one atom at each lattice site. The quantum state has no spatial 
entanglement with respect to any partitioning in this phase—it is in a 
product state of the Fock states. As the interaction strength is reduced 
adiabatically, atoms begin to tunnel across the lattice sites, and ultimately 
the Mott insulator melts into a superfluid with a fixed atom number. The 
delocalization of atoms creates entanglement between spatial subsystems. 
This entanglement originates*!? from correlated fluctuations in the 
number of particles between the subsystems due to the super-selection 
rule that the total particle number in the full system is fixed, as well as 
coherence between various configurations without any such fluctuation. 
To probe the emergence of entanglement, we first prepare the ground 
state of equation (4) in both copies by adiabatically lowering the optical 
Figure 3 | Many-body interference to probe entanglement in optical
lattices. a, A high-resolution microscope is used to directly image the
number parity of ultracold bosonic atoms on each lattice site (in the raw
images, green represents odd and black represents even). Two adjacent
one-dimensional lattices are created by combining an optical lattice
and potentials created by a spatial light modulator. We initialize two
identical many-body states by filling the potentials from a low-entropy
two-dimensional Mott insulator. The tunnelling rates $J_x$ and $J_y$ can be
tuned independently by changing the depth of the potential. b, The
atomic beam splitter operation is realized in a tunnel-coupled
double-well potential. An atom, initially localized in one of the wells,
delocalizes with equal probability into both the wells by this beam splitter.
Here, we show the atomic analogue of the HOM interference of two states.
The joint probability $P(1, 1)$ measures the probability of coincidence
detection of the atoms in separate wells as a function of normalized
tunnel time $J_y t$, with the single-particle tunnelling $J_y = 193(4)$ Hz.
At the beam splitter duration ($J_y t = 1/8$) bosonic interference leads
to a nearly vanishing $P(1, 1)$, corresponding to an even parity in the
output states. This can be interpreted as a measurement of the purity
of the initial Fock state, here measured to be 0.90(4). The data shown
here are averaged over two independent double wells. The blue curve
is a maximum-likelihood fit to the data, and the error bars reflect 1σ
statistical error. c, When two copies of a product state, such as the Mott
insulator in the atomic limit, are interfered on the beam splitter, the
output states contain even particle numbers globally (full system) as well
as locally (subsystem), indicating pure states in both. d, On the other
hand, for two copies of an entangled state, such as a superfluid state, the
output states contain even particle numbers globally (pure state) but a
mixture of odd and even outcomes locally (mixed state). This directly
demonstrates entanglement.
lattice potential along x. Then we freeze the tunnelling along x without
destroying the coherence in the many-body state and apply the beam
splitter along y. Finally, we rapidly turn on a very deep two-dimensional
lattice to suppress all tunnelling and detect the atom number parity
(even = 1, odd = −1) at each site. We construct the parity of a spatial
region by multiplying the parities of all the sites within that region. 
The average parity over repeated realizations measures the quantum 
purity, both globally and locally, according to equation (3), enabling 
us to determine the second-order Rényi entropy globally and for all 
possible subsystems. 
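
The parity-to-entropy step described above can be sketched in a few lines. The example assumes that the detected site parities of one output copy are stored as an array of +1/−1 values, one row per experimental realization; the data below are synthetic and purely illustrative, and the natural logarithm is an assumed convention.

```python
import numpy as np


def renyi2_from_parities(site_parities, region):
    """Estimate purity and second-order Renyi entropy of a spatial region.

    site_parities: array of shape (n_shots, n_sites) with entries +1 (even)
                   or -1 (odd) for one output copy after the beam splitter.
    region:        list of site indices forming the region.
    """
    site_parities = np.asarray(site_parities)
    # Parity of the region = product of the parities of its sites.
    region_parity = np.prod(site_parities[:, region], axis=1)
    # The average parity over realizations estimates Tr(rho^2).
    purity = region_parity.mean()
    return purity, -np.log(purity)


# Synthetic two-site data: global parity always even, local parity mixed.
shots = np.array([[1, 1]] * 8 + [[-1, -1]] * 2)
purity_a, s2_a = renyi2_from_parities(shots, [0])
purity_ab, s2_ab = renyi2_from_parities(shots, [0, 1])
print(purity_a, s2_a)    # local: purity 0.6, finite entropy
print(purity_ab, s2_ab)  # global: purity 1.0, zero entropy
```

In this toy data set the subsystem has more entropy than the full system, which is the signature of entanglement discussed in the text. An average parity at or below zero would make the logarithm undefined, so real data would need enough statistics to avoid that.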

In the atomic Mott insulator limit (Fig. 3c), the state is separable. 
Hence, the interference signal between two copies should show even 
parity in all subsystems, indicating a pure state with zero entangle- 
ment entropy. Towards the superfluid regime (Fig. 3d), the build-up 
of entanglement between various lattice sites leads to mixed states in 
subsystems, corresponding to a finite entanglement entropy. Hence, 
the measurement outcomes do not have a pre-determined parity. 
Remarkably, the outcomes should still retain even global parity, indi- 
cating a pure global state. Higher entropy in the subsystems than the 
global system cannot be explained classically and demonstrates bipar- 
tite entanglement. 

Experimentally, we find exactly this behaviour for our two 4-site 
Bose-Hubbard systems (Fig. 4). We observe the emergence of spatial 
entanglement as the initial atomic Mott insulator melts into a superfluid.
The measured quantum purity of the full system is about 0.6 across the
Mott insulator to superfluid crossover, corresponding to a Rényi entropy
of $S_2(AB) \approx 0.5$. The measured purity deep in the superfluid phase
is slightly reduced, probably owing to the reduced beam splitter fidelity
in the presence of increased single-site occupation number, and any
residual heating. The nearly constant global purity indicates a high level
of coherence throughout the crossover. For lower interaction strength
$U/J_x$ (superfluid regime), we observe that the subsystem Rényi entropy
is higher than that of the full system: $S_2(A) > S_2(AB)$. This
demonstrates the presence of spatial entanglement in the superfluid state.
In the Mott insulator regime ($U/J_x \gg 1$), $S_2(A)$ is lower than
$S_2(AB)$ and proportional to the subsystem size, consistent with a
product state.
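
The ideal, noise-free behaviour can be illustrated with a small exact-diagonalization sketch for four bosons on four sites. It assumes the standard open-chain Bose-Hubbard Hamiltonian, $H = -J \sum_i (a_i^\dagger a_{i+1} + \mathrm{h.c.}) + (U/2) \sum_i n_i (n_i - 1)$, which may differ in detail from equation (4); all function names are hypothetical.

```python
import itertools
import numpy as np


def fock_basis(n_sites, n_atoms):
    """All occupation tuples with a fixed total atom number."""
    return [occ for occ in itertools.product(range(n_atoms + 1), repeat=n_sites)
            if sum(occ) == n_atoms]


def bose_hubbard_hamiltonian(basis, J, U):
    """Open-chain Bose-Hubbard matrix in the given Fock basis (assumed form)."""
    index = {occ: k for k, occ in enumerate(basis)}
    H = np.zeros((len(basis), len(basis)))
    for k, occ in enumerate(basis):
        H[k, k] = 0.5 * U * sum(n * (n - 1) for n in occ)
        for i in range(len(occ) - 1):
            if occ[i + 1] > 0:  # move one atom from site i+1 to site i
                new = list(occ)
                new[i] += 1
                new[i + 1] -= 1
                l = index[tuple(new)]
                amp = -J * np.sqrt(occ[i + 1] * (occ[i] + 1))
                H[l, k] += amp
                H[k, l] += amp  # Hermitian conjugate (reverse hop)
    return H


def renyi2_entropy(psi, basis, sites_a):
    """S_2(A) = -log Tr(rho_A^2) for a real pure state psi."""
    rows, cols, entries = {}, {}, []
    for amp, occ in zip(psi, basis):
        a = tuple(occ[i] for i in sites_a)
        b = tuple(n for i, n in enumerate(occ) if i not in sites_a)
        entries.append((rows.setdefault(a, len(rows)),
                        cols.setdefault(b, len(cols)), amp))
    psi_matrix = np.zeros((len(rows), len(cols)))
    for r, c, amp in entries:
        psi_matrix[r, c] = amp
    rho_a = psi_matrix @ psi_matrix.T  # reduced density matrix of A
    return -np.log(np.trace(rho_a @ rho_a))


def ground_state_entropy(u_over_j, sites_a=(0, 1), n_sites=4, n_atoms=4):
    basis = fock_basis(n_sites, n_atoms)
    H = bose_hubbard_hamiltonian(basis, J=1.0, U=u_over_j)
    _, vecs = np.linalg.eigh(H)
    return renyi2_entropy(vecs[:, 0], basis, sites_a)


for u_over_j in (100.0, 10.0, 1.0):
    print(u_over_j, ground_state_entropy(u_over_j))
```

In this idealized calculation the full system is pure, so its entropy is zero and any subsystem entropy is entanglement entropy. The measured curves additionally carry a classical offset, which the sketch does not model.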
Figure 4 | Entanglement in the ground state of the Bose-Hubbard
model. We study the transition from Mott insulator to superfluid with four
atoms on four lattice sites in the ground state of the Bose-Hubbard model,
equation (4). a, As the interaction strength $U/J_x$ is adiabatically reduced,
the purity of the subsystem A (green and blue, inset), $\mathrm{Tr}(\rho_A^2)$, becomes less
than that of the full system (red). This demonstrates entanglement in the
superfluid phase, generated by coherent tunnelling of bosons across
lattice sites. In terms of the second-order Rényi entanglement entropy,
$S_2(A) = -\log \mathrm{Tr}(\rho_A^2)$, the full system has less entropy than its subsystems in
this state. In the Mott insulator phase ($U/J_x \gg 1$) the full system has more
Rényi entropy (and less purity) than the subsystems, owing to the lack of
sufficient entanglement and a contribution of classical entropy.
The circles are data points and the solid lines are theoretical, calculated
In these measurements, we post-select outcomes of the experiment
for which the total number of atoms detected in both copies is even. 
This constitutes about 60% of all the data, and excludes realizations 
with preparation errors, atom loss during the sequence, or detection 
errors (Supplementary Information). The measured purity is consist- 
ent with an imperfect beam splitter operation alone, suggesting much 
higher purity for the many-body state. The measured entropy is thus 
a sum of an extensive classical entropy due to the imperfections of the 
beam splitter and any entanglement entropy. 

Our site-resolved measurement simultaneously provides informa- 
tion about all possible spatial partitionings of the system. Comparing 
the purity of all subsystems with that of the full system enables us to 
determine whether a quantum state has genuine spatial multipartite
entanglement, in which each site is entangled with every other.
Experimentally, we find that this is indeed the case for small $U/J_x$
(Fig. 4b). In the superfluid phase, all possible subsystems have more
entropy than the full system, demonstrating full spatial multipartite
entanglement between all four sites. In the Mott phase ($U/J_x \gg 1$),
the measured entropy is dominated by extensive classical entropy, 
showing a lack of entanglement. 

By measuring the second-order Rényi entropy we can calculate other
useful quantities, such as the associated mutual information
$I_{AB} = S_2(A) + S_2(B) - S_2(AB)$. Mutual information exhibits interesting
scaling properties with respect to the subsystem size, which can be
key to studying area laws in interacting quantum systems. In some
cases, such as in ‘data hiding states’, mutual information is more
informative than the more conventional two-point correlators,
which might take arbitrarily small values in the presence of strong
correlations. Mutual information is also immune to extensive classical
entropy, and hence has practical utility in the experimental study
of larger systems. In our experiments (Fig. 5a), we find that for the
Mott insulator state ($U/J_x \gg 1$), the entropy of the full system is the
sum of the entropies for the subsystems. The mutual information is
$I_{AB} \approx 0$ for this state, consistent with a product state in the presence
of extensive classical entropy. At $U/J_x \approx 10$, correlations between the
subsystems begin to grow as the system adiabatically melts into a
superfluid, resulting in non-zero mutual information, $I_{AB} > 0$.
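
The cancellation of extensive classical entropy can be made explicit with a short sketch. The per-site classical entropy of 0.125 used here is an illustrative assumption, chosen so that four sites give the offset of about 0.5 quoted for the full system; the entanglement entropy value is likewise hypothetical.

```python
def renyi_mutual_information(s2_a, s2_b, s2_ab):
    """I_AB = S_2(A) + S_2(B) - S_2(AB)."""
    return s2_a + s2_b - s2_ab


CLASSICAL_PER_SITE = 0.125  # assumed extensive classical entropy per site


def measured_entropies(n_a, n_b, entanglement_entropy):
    """Toy model: classical entropy scales with size, entanglement adds to A and B."""
    s2_a = n_a * CLASSICAL_PER_SITE + entanglement_entropy
    s2_b = n_b * CLASSICAL_PER_SITE + entanglement_entropy
    s2_ab = (n_a + n_b) * CLASSICAL_PER_SITE  # the global state is otherwise pure
    return s2_a, s2_b, s2_ab


# Product state (Mott insulator limit): no entanglement between A and B.
print(renyi_mutual_information(*measured_entropies(2, 2, 0.0)))  # 0.0
# Entangled state with a hypothetical entanglement entropy of 0.4.
print(renyi_mutual_information(*measured_entropies(2, 2, 0.4)))  # 0.8
```

In this toy model the classical contribution drops out exactly, leaving twice the entanglement entropy of the bipartition.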
from exact diagonalization. The only free parameter is an added offset,
assumed to be proportional to the system size and consistent with the
average measured entropy (about 0.5) in the full system. The vertical error
bars in this figure and in Figs 5 and 6 indicate 1σ in combined statistical
and systematic errors (Supplementary Information). b, Second-order
Rényi entropy of all possible bipartitionings of the system. For small $U/J_x$,
all subsystems (data points connected by green and blue lines) have more
entropy than the full system (red circles), indicating full multipartite
entanglement between the four lattice sites. The residual entropy in the
Mott insulating regime is from classical entropy in the experiment, and
extensive in the subsystem size. The right-hand panel in b shows the values
of all Rényi entropies for the particular case of $U/J_x \approx 1$, to demonstrate
spatial multipartite entanglement in this superfluid.


© 2015 Macmillan Publishers Limited. All rights reserved 


[Figure 5 graphic. Panel labels: Mott insulator, Adiabatic melt, Superfluid. Axes: Boundary size; Rényi entropy, $S_2$.]
Figure 5 | Rényi mutual information in the ground state. Any
contribution from the extensive classical entropy in our measured Rényi
entropy can be factored out by constructing the mutual information
$I_{AB} = S_2(A) + S_2(B) - S_2(AB)$. a, We plot the summed entropy
$S_2(A) + S_2(B)$ (in blue, green and light blue corresponding to the partitions
shown) and the entropy of the full system $S_2(AB)$ (in red) separately.
Mutual information is the difference between the two, as shown by the
arrow for a partitioning scheme. In the Mott insulator phase ($U/J_x \gg 1$)
the sites are not correlated, and $I_{AB} \approx 0$. Correlations start to build up
for smaller $U/J_x$, resulting in a non-zero mutual information. The theory
curves are from exact diagonalization, with added offsets consistent with
the extensive entropy in the Mott insulator phase (about 0.5 for the full
system). b, Classical and entanglement entropies follow qualitatively
different scaling laws in a many-body system. The top panel in b shows
that in the Mott insulator phase classical entropy dominates and $S_2(A)$


It is instructive to investigate the scaling of Rényi entropy and mutual
information with subsystem size, since in larger systems they can
characterize quantum phases, for example by measuring the central
charge of the underlying quantum field theory. Figure 5b shows these
quantities versus the subsystem size for various partitioning schemes 
with a single boundary. For the atomic Mott insulator the Rényi entropy 
increases linearly with the subsystem size and the mutual information 
is zero, consistent with both a product state and classical entropy being 
uncorrelated between various sites. In the superfluid state the measured 
Rényi entropy curves are asymmetric and first increase with the system 
size, then fall again as the subsystem size approaches that of the full 
system. This represents the combination of entanglement entropy and 
the linear classical entropy. The non-monotonicity is a signature of 
the entanglement entropy, as the entropy for a pure state must vanish 
when the subsystem size is zero or the full system. The asymmetry due 
to classical entropy is absent in the mutual information. 

The mutual information between two subsystems comes from the 
correlations across their separating boundary. For a 4-site system, 
the boundary size ranges from one to three for various partitioning 
schemes. Among those schemes with a single boundary, maximum 
mutual information in the superfluid is obtained when the boundary 
divides the system symmetrically (Fig. 5a). Increasing the boundary 
size increases the mutual information, as more correlations are inter- 
rupted by the partitioning (Fig. 5c). 

Mutual information also elucidates the onset of correlations between 
various sites as the few-body system crosses over from a Mott insulator
to a superfluid phase. In the Mott insulator phase ($U/J_x \gg 1$) the
mutual information between all sites vanishes (Fig. 5c, bottom). As the
particles start to tunnel, only the nearest-neighbour correlations start
to build up ($U/J_x \approx 12$) and the long-range correlations remain
negligible. Further into the superfluid phase, the correlations extend beyond
the nearest neighbour and become long range for smaller $U/J_x$. These
results suggest disparate spatial behaviour of the mutual information 
and $S_2(B)$ follow a volume law: entropy increases with the size of the
subsystem. The mutual information $I_{AB} \approx 0$. The bottom panel in b shows
the non-monotonic behaviour of $S_2(A)$ and $S_2(B)$ in the superfluid regime,
due to the dominance of entanglement over classical entropy, which
makes the curves asymmetric. $I_{AB}$ restores the symmetry by removing the
classical uncorrelated noise. The solid lines are linear (top) and quadratic 
(bottom) fits included as a guide to the eye. The top panel in c shows that 
more correlations are affected (red arrow) with increasing boundary area, 
leading to a growth of mutual information between subsystems. The data 
points are for various partitioning schemes shown in Fig. 4b. The bottom 
panel in c plots $I_{AB}$ as a function of the distance d between the subsystems
to show the onset and spread of correlations in space, as the Mott insulator 
adiabatically melts into a superfluid. In these plots some overlapping data 
points are offset from each other horizontally for clarity. 


in the ground state of an uncorrelated (Mott insulator) and a strongly 
correlated phase (superfluid). For larger systems this can be exploited 
to identify quantum phases and the onset of quantum phase transitions. 


Non-equilibrium entanglement dynamics 

Away from the ground state, the non-equilibrium dynamics of a quantum
many-body system is often theoretically intractable. This is due to
the growth of entanglement beyond the access of numerical techniques,
such as the time-dependent density matrix renormalization group theory.
Experimental investigation of entanglement may shed valuable
light onto non-equilibrium quantum dynamics. Towards this goal, we
study a simple system: two particles oscillating in a double well. The
non-equilibrium dynamics are described by the Bose-Hubbard model.
The quantum state of the system oscillates between unentangled states
(particles localized in separate wells) and entangled states in the Hilbert
space spanned by $|1, 1\rangle$, $|2, 0\rangle$ and $|0, 2\rangle$. Here, $|m, n\rangle$ denotes a state
with m and n atoms in the two subsystems (wells), respectively. Starting
from the product state $|1, 1\rangle$ the system evolves through the maximally
entangled states $|2, 0\rangle + |0, 2\rangle + |1, 1\rangle$ and the symmetric, HOM-like
state $|2, 0\rangle + |0, 2\rangle$. In the maximally entangled states the subsystems
are completely mixed, with a probability of 1/3 of having zero, one or
two particles. The system then returns to the initial product state $|1, 1\rangle$
before re-entangling. In our experiment, we start with a Mott insulating
state ($U/J_x \gg 1$), and suddenly quench the interaction parameter to a
low value, $U/J_x \approx 0.3$. The non-equilibrium dynamics is demonstrated
(Fig. 6) by the oscillation in the second-order Rényi entropy of the
subsystem, while that of the full system remains constant at a value
originating from classical entropy. This experiment also demonstrates
entanglement in HOM-like interference of two massive particles.
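
The ideal quench dynamics can be reproduced with a three-state sketch in the basis $|2, 0\rangle$, $|1, 1\rangle$, $|0, 2\rangle$. It assumes the standard two-site Bose-Hubbard Hamiltonian with the tunnelling rate given as a frequency, so that the phase accumulated in time $t$ is $2\pi E t$ (consistent with a beam splitter at $J t = 1/8$); the default parameter values are taken from the text and the function name is hypothetical.

```python
import numpy as np


def quench_entropy(times, J=210.0, U_over_J=0.3):
    """Second-order Renyi entropy of one well after a quench from |1, 1>.

    times: evolution times in seconds; J in Hz (assumed frequency units).
    """
    U = U_over_J * J
    hop = -np.sqrt(2.0) * J  # matrix element between |1,1> and |2,0> or |0,2>
    H = np.array([[U, hop, 0.0],
                  [hop, 0.0, hop],
                  [0.0, hop, U]])
    energies, vecs = np.linalg.eigh(H)
    psi0 = np.array([0.0, 1.0, 0.0])  # product state |1, 1>
    coeffs = vecs.T @ psi0
    entropies = []
    for t in times:
        psi_t = vecs @ (np.exp(-2j * np.pi * energies * t) * coeffs)
        # With fixed total atom number, the reduced state of one well is
        # diagonal in its occupation number, with these probabilities.
        probs = np.abs(psi_t) ** 2
        entropies.append(-np.log(np.sum(probs ** 2)))
    return np.array(entropies)


times = np.linspace(0.0, 5e-3, 2001)
s2 = quench_entropy(times)
print(s2.max(), np.log(3))  # the maximum approaches log 3
```

The entropy starts at zero for the product state and rises towards $\log 3$ when the three occupation probabilities are each 1/3, matching the completely mixed subsystem described above. The classical offset seen in the measured data is not modelled.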


Summary and outlook 
In this work, we perform a direct measurement of quantum purity, the 
second-order Rényi entanglement entropy, and mutual information 
Figure 6 | Entanglement dynamics in a quench. Entanglement dynamics
of two atoms in two sites after a sudden quench of the Hamiltonian from
a large value of $U/J_x$ to $U/J_x \approx 0.3$, with $J_x \approx 210$ Hz. Here, ‘evolution time’
refers to the duration that the atoms spend in the shallow double well, after
the initial sudden quench. The system oscillates between Mott insulator
(I) and quenched superfluid regimes (II, III). The growth of bipartite
entanglement in the superfluid regime is seen by comparing the measured 
Rényi entropy of the single site subsystem (blue data points) to that of the 
two site full system (red data points). The solid lines are the theoretical 
curves, with vertical offsets to include the classical entropy introduced by 
experimental imperfections. 


in a Bose-Hubbard system. Our measurement scheme does not rely 
on full density matrix reconstruction or the use of specialized witness 
operators to detect entanglement. Instead, by preparing and inter- 
fering two identical copies of a many-body quantum state, we probe 
entanglement with the measurement of only a single operator. Our 
experiments represent an important demonstration of the usefulness 
of the many-body interference for the measurement of entanglement. 
It is straightforward to extend the scheme to fermionic systems and
systems with internal degrees of freedom, and to two dimensions.
By generalizing the interference to n copies of the quantum state,
arbitrary observables written as an nth-order polynomial function of 
the density matrix—for example, Rényi entropies of order n > 2—can 
be measured. 

With modest technical upgrades to suppress classical fluctuations 
and residual interactions, it should be possible to further improve the 
beam splitter fidelity, enabling us to work with much larger systems. 
Mutual information may be ideal for exploring larger systems as it is 
insensitive to any residual extensive classical entropy. For high entropy 
of a subsystem, corresponding to low state purity, the number of meas- 
urements required to reach a desired precision is high. However, in 
contrast to tomographic methods, this scheme would not require addi- 
tional operations for larger systems. Moreover, the single-site resolu- 
tion of the microscope allows us to simultaneously obtain information 
about all possible subsystems, to probe multipartite entanglement. 

For non-equilibrium systems, entanglement entropy can grow in 
time (indefinitely in infinite systems). This leads to interesting many-body
physics, such as thermalization in closed quantum systems. The
long duration of growth of entanglement entropy is considered to be a
key signature of many-body localized states arising in the presence of
disorder. The ability to measure the quantum purity for these systems 
would allow experimental distinction of quantum fluctuations and 
classical statistical fluctuations. 
More generally, by starting with two different quantum states in the
two copies this scheme can be applied to measure the quantum state 
overlap between them. This would provide valuable information about 
the underlying quantum state. For example, the many-body ground 
state is very sensitive to perturbations near a quantum critical point. 
Hence, the overlap between two ground states with slightly different 
parameters (such as U/J in the Bose-Hubbard Hamiltonian) could be 
used as a sensitive probe of quantum criticality. Similarly, the overlap
of two copies undergoing non-equilibrium evolution under different 
perturbations can be used to probe temporal correlation functions in 
non-equilibrium quantum dynamics. 
Pharmacogenomic agreement between
two cancer cell line data sets 


The Cancer Cell Line Encyclopedia and Genomics of Drug Sensitivity in Cancer Investigators* 


Large cancer cell line collections broadly capture the genomic diversity of human cancers and provide valuable insight 
into anti-cancer drug response. Here we show substantial agreement and biological consilience between drug sensitivity 
measurements and their associated genomic predictors from two publicly available large-scale pharmacogenomics 
resources: The Cancer Cell Line Encyclopedia and the Genomics of Drug Sensitivity in Cancer databases. 


Performing in vitro pharmacological sensitivity studies across panels
of molecularly characterized cancer cell lines has proved useful
in assessing the cellular activity of many compounds, assigning
mechanisms of drug action, and determining genetic contexts for
distinct cancer vulnerabilities¹⁻⁶. A recent comparison study⁷ of the
Cancer Cell Line Encyclopedia (CCLE)⁸ and the Genomics of Drug
Sensitivity in Cancer (GDSC)⁹ reported poor correlations between their
pharmacological data, thus questioning the validity of some conclusions.
These observations raised important questions for the field about
how best to perform comparisons of large-scale data sets, evaluate the
robustness of such studies, and interpret their analytical outputs.

To address these questions, we performed a systematic comparison
of the CCLE and GDSC pharmacological data and drug sensitivity
predictors. Our results show that when biologically grounded analytical
considerations are incorporated, pharmacological data from the CCLE
and GDSC studies exhibit reasonable consistency. Most importantly,
these analyses demonstrate that data from either study yield similar
predictors of drug response.


Comparison of cell line pharmacological data sets 

To evaluate the consistency of the pharmacological data from the 
two studies, we first performed a comparative analysis of CCLE and 
GDSC drug screening metrics. For this analysis, we used both the 50%
inhibitory concentration (IC50) and the area under the curve (AUC;
also referred to as activity area in CCLE when considering 1 − AUC).
Importantly, IC50 values were capped at the maximum tested drug
concentrations to ensure that they could be properly compared between
both data sets (Supplementary Data 1). Also, a fixed axis scale was
applied across all compounds to facilitate visualization (Extended
Data Fig. 1). Of note, while 471 cell lines are present in both CCLE
and GDSC collections and have associated genomic data, only a subset
of those have overlapping drug screening data: a range of 82-256 cell
lines per compound (median = 94 cell lines; mean = 157; Fig. 1a and
Supplementary Data 1). 
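
The capping step can be sketched as follows. The exact procedure is given in the Supplementary Data and is not reproduced here; this example simply clips each value at a ceiling, and the ceiling of 8 µM and the IC50 values are hypothetical.

```python
import numpy as np


def cap_ic50(ic50_um, max_tested_um):
    """Clip IC50 values (in µM) at the maximum tested concentration."""
    return np.minimum(np.asarray(ic50_um, dtype=float), max_tested_um)


# Hypothetical IC50 values; extrapolated values above the ceiling are capped.
print(cap_ic50([0.1, 5.0, 20.0], 8.0))  # [0.1 5.  8. ]
```

Capping both data sets at a shared ceiling prevents extrapolated values beyond the tested range from dominating the comparison.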

Our analytical approach was designed to account for the fact that 
many pharmacological profiles exhibit highly discontinuous distribu- 
tions across cancer cell line collections. Whereas a subset of individ- 
ual lines may show marked pharmacological sensitivity, the remaining 
lines—often the vast majority of cell lines in the collection—may be 
relatively insensitive to a given drug. Such ‘outlier’ distributions are
expected, as they are typically observed for drugs that target specific
oncogenic dependencies. Given the relative paucity of sensitive outliers,
appropriate pharmacological assessments require multiple drug-sensitive
cell lines for each compound and the ability to discern this
relevant signal against a background dominated by the insensitive
majority. Additionally, small data sets containing exclusively insensitive lines
are not expected to display significant correlations given the inherent 
noise in their drug response data. 

In cases where direct GDSC-CCLE comparisons were possible, nearly 
all compounds (13/15) exhibited AUC and IC50 distributions dominated
by drug-insensitive lines, with a much smaller number of drug-sensitive
outliers. The complete distributions of all CCLE and GDSC
AUC values are illustrated for each compound by ‘violin plots’, while
overlapping lines are displayed as a scatter plot (representative examples
are shown in Fig. 1, and all plots in Extended Data Fig. 1); results
for IC50 values are similar (Extended Data Fig. 1). Ten compounds
(saracatinib (also known as AZD0530), erlotinib, lapatinib, nilotinib, 
crizotinib, nutlin-3, PD0332991, PHA665752, PLX4720 and sorafenib) 
exhibited AUC values skewed heavily towards the drug-insensitive end 
of the spectrum. Notably, several targeted anticancer drugs had very 
few (if any) drug-sensitive lines in the overlapping set (for example, 2 
for crizotinib, 3 for nilotinib, 2 for TAE684, and zero for erlotinib or 
sorafenib; Fig. 1b, c and Extended Data Fig. 1). This relative paucity of 
drug-sensitive cell lines in the overlapping set constrained the level of 
correlation achievable. 

Nevertheless, a correlation analysis that accounted for the imbalance 
between the number of sensitive and insensitive cell lines, and for
differences in the original analytical methodologies, yielded good
consistency in most cases (see Extended Data Fig. 2, comparing the
properties of Spearman's and Pearson's correlations in this context, and
Supplementary Discussion). When using the Pearson correlation coefficient
instead of Spearman's⁷, as well as consistently capped drug sensitivity metrics,
correlation values were clearly improved for most drugs compared to
the earlier comparison study⁷ (Fig. 1d, e, Methods and Supplementary
Discussion). We noted that some correlation values remained poor,
either owing to differences in cell line biology or in actual pharmacological
measurements (for example, nutlin-3, paclitaxel and PHA665752),
or because sensitive lines were only present in one of the cell line
collections (for example, erlotinib and sorafenib), preventing any meaningful
comparison (Fig. 1c).
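
The effect of an outlier-dominated distribution on the two correlation measures can be illustrated with synthetic data. The sketch below is not the authors' analysis: it assumes 95 insensitive lines whose measured responses differ only by independent noise and 5 sensitive lines with a shared strong response, and all numbers are invented for illustration.

```python
import numpy as np


def ranks(values):
    """Ranks 1..n; assumes no ties (true for continuous synthetic data)."""
    order = np.argsort(values)
    out = np.empty(len(values))
    out[order] = np.arange(1, len(values) + 1)
    return out


def pearson(x, y):
    return np.corrcoef(x, y)[0, 1]


def spearman(x, y):
    # Spearman's coefficient is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))


rng = np.random.default_rng(0)
n_insensitive, n_sensitive = 95, 5
true_response = np.concatenate([np.full(n_insensitive, 0.05),
                                np.linspace(0.6, 0.9, n_sensitive)])
study_a = true_response + rng.normal(0.0, 0.02, true_response.size)
study_b = true_response + rng.normal(0.0, 0.02, true_response.size)

print("Pearson :", pearson(study_a, study_b))
print("Spearman:", spearman(study_a, study_b))
```

In this toy setting the rank-based coefficient is dominated by noise among the insensitive majority, whereas the Pearson coefficient reflects the agreement on the few sensitive lines.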

To complement this correlation analysis, we used a waterfall plot-based 
assessment (Extended Data Fig. 3 shows a schematic of the workflow 
and further details are provided in the Supplementary Discussion). This 
analysis confirmed that on average, 94% of cell lines for the 13 relevant 
compounds (CCLE mean = 94%, range = 77-100%; GDSC mean = 96%, 
range = 86-100%; Supplementary Data 2) clustered within a drug-insensitive
range (for example, IC50 values of >1 µM for most compounds).
These waterfall analyses also showed a high consistency of


*Lists of participants and their affiliations appear at the end of the paper. 
Figure 1 | Comparison of pharmacological data from the CCLE and
GDSC studies. a, Overlap of data sets. b, c, Comparison of drug sensitivity
(AUC) measured in n overlapping cell lines between the studies for drugs
with good (b) or poor (c) correlation. R, Pearson correlation coefficient;
P, P value. Violin plots, distribution of sensitivity values for all lines in
each study. Grey dot, median; black line, interquartile range; shape, kernel
density of the distribution. d, e, Correlation coefficients between GDSC


cell line categorization as “sensitive” or “resistant” between CCLE and 
GDSC data (Fig. 1f, Extended Data Fig. 3). This consistency was evident
even when using a simple drug sensitivity cut-off (1 µM) across
all the drugs tested (Extended Data Fig. 3). Thus, both categorization
approaches showed higher consistency than reported in the earlier study⁷
(see Supplementary Discussion). These results indicated that the CCLE 
and GDSC cell line pharmacological screening data are best suited for 
modelling studies that distinguish rare, drug-sensitive lines from “all 
others” (for example, from drug-insensitive lines that are not expected 
to contribute meaningful molecular or genetic information). 
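
Agreement between binary sensitivity calls can be quantified with Cohen's kappa, as used in Fig. 1f. The sketch below applies the simple 1 µM cut-off to hypothetical IC50 values; the waterfall-based calling described in the Supplementary Discussion is not reproduced.

```python
import numpy as np


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two sets of binary calls (True = sensitive)."""
    a = np.asarray(calls_a, dtype=bool)
    b = np.asarray(calls_b, dtype=bool)
    observed = np.mean(a == b)
    # Agreement expected by chance from the marginal call frequencies.
    expected = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())
    return (observed - expected) / (1 - expected)


CUTOFF_UM = 1.0  # lines with IC50 below the cut-off are called sensitive

# Hypothetical capped IC50 values (µM) for ten overlapping cell lines.
ccle_ic50 = np.array([0.05, 0.3, 8, 8, 8, 8, 2.5, 8, 8, 0.8])
gdsc_ic50 = np.array([0.1, 0.5, 8, 8, 8, 8, 0.7, 8, 8, 4.0])

kappa = cohens_kappa(ccle_ic50 < CUTOFF_UM, gdsc_ic50 < CUTOFF_UM)
print(round(kappa, 3))  # 0.524
```

Here eight of ten calls agree, against a chance agreement of 0.58, giving a kappa of about 0.52. The function is undefined when chance agreement equals 1, for example when every line is called insensitive in both studies.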


Comparison of drug sensitivity predictors 

Next, we considered the extent to which the CCLE and GDSC cell line 
collections illuminated common genetic or molecular underpinnings 
of anticancer drug efficacy. Such insights provide one of the most rele- 
vant measures for concordance and utility of pharmacological screening 
data, given that these efforts are designed to identify such predictors 
of drug response. First, we determined whether molecular correlates 
of drug response were aligned between the two data sets. Here we per- 
formed an analysis of variance (ANOVA) using the overlapping lines 




and CCLE data sets. x axis, Spearman, Haibe-Kains et al.⁷; y axis, Pearson,
present analysis. Dot sizes are proportional to the number of overlapping
cell lines. Dots above the dashed y = x line denote an improved correlation
compared to Haibe-Kains et al.⁷. f, Comparisons of Cohen’s Kappa
coefficient testing studies’ agreement in Haibe-Kains et al.⁷ (x axis) and the
present study (y axis) for sensitivity/resistance calling using a waterfall plot 
analysis. 


across the CCLE and GDSC. We considered two models where the 
predicted variables were IC50 values or activity area (that is, 1 − AUC)
scores, respectively. In both models we considered the tissue of origin 
as a covariate and the mutational status of 71 oncogenes as independent 
variables. 

ANOVA identified known genetic biomarkers of sensitivity or resist- 
ance as top molecular correlates in at least one data set for 13/15 com- 
pounds, and in both data sets for 8/15 compounds (Fig. 2a, Extended 
Data Fig. 4, Supplementary Data 3). Genetic correlates in both data sets 
included NRAS mutation and sensitivity to MEK inhibitor PD0325901, 
BRAF mutations and sensitivity to BRAF inhibitor PLX4720, the BCR-ABL1
fusion gene and sensitivity to multiple ABL1 inhibitors (nilotinib,
AZD0530) and sensitivity of ERBB2-amplified cells to ERBB2 inhibitor
lapatinib (identified when using IC50 values; Extended Data
Fig. 4). Additionally, drug resistance associations such as TP53 mutations 
and resistance to nutlin-3 were recovered consistently using activity area 
scores. When ANOVA was fitted to activity area, 14 drugs for the GDSC 
and 15 for the CCLE also showed lineage-specific response associations 
that were consistent across data sets (post-hoc Welch t-test; Extended 
Data Fig. 5 and Supplementary Data 4 and 7). 
Figure 2 | Consistency of drug sensitivity prediction markers between
the CCLE and GDSC data sets. a, ANOVA on overlapping data set
(1 − AUC). Coordinates, signed log q-values. Negative sign, gene
associated with increased sensitivity; positive, increased resistance.
Distance from 0, q-value. Fisher's exact test of consistency of marker
behaviour on all or only significant associations. Markers in grey are
not significant; markers highlighted are significant in both the studies.
Corresponding drug name, target(s) and cancer gene are reported for a


In a more comprehensive assessment of the consistency of genomic 
predictors, we applied a multivariate analysis across 21,013 genomic
features encompassing expression, copy number changes and mutations⁸,⁹.
Elastic net regression was performed using either the full data set availa- 
ble for each study or only the overlapping data sets. This analysis yielded 
robust response predictors, and the overlap of predictors was highly 
significant (y” P< 10~*; Extended Data Fig. 6, Supplementary Data 5). 
Here again, known genomic predictors of drug response emerged as top 
molecular correlates in at least one data set for 13/15 compounds; 10/15 
compounds showed such correlates in both data sets (Supplementary 
Data 5), as reported previously by CCLE and GDSC using their individual
data sets⁸,⁹. For some drugs, extending elastic net regression analyses
of IC50 values beyond just the overlapping cell lines identified additional
genetic predictors of clinical activity. MDM2 expression and TP53 muta- 
tions in the case of nutlin-3 sensitivity provide one example. Moreover, 
among 4,957 drug-gene associations found using elastic net modelling 
on each data set, we only observed one divergent result (0.02%) between 
the two studies. 

To further explore how the two data sets might be leveraged to identify 
genomic predictors of drug sensitivity, we performed a two-step analy- 
sis where predictors were identified using one data set and their effects 
were analysed in the other data set. Here, we used elastic net regres- 
sion to identify the genomic features and ridge regression to compare 
their effect across the data sets (Fig. 2b and Supplementary Discussion). 
Additionally, we performed this discovery step either on the overlapping 
cell lines or on all lines available in the respective studies. 

We again observed a high consistency of predictive genomic features 
identified across the CCLE and GDSC studies, even for drugs where 
few overlapping cell lines were available. Indeed, >80% of these features
were identified with concordant directionality in both studies (Fig. 2c, d,
Extended Data Figs 7-9 and Supplementary Data 6, features with same
sign). In some instances, no predictors could be identified by the initial
elastic net regression. This was often attributable at least in part to small
numbers of drug-sensitive cell lines, as noted above (Extended Data
Fig. 10). On the other hand, some drugs that exhibited low correlations
based on the AUC or IC50 analyses nonetheless enabled identification of
consistent predictors (for example, nutlin-3; Fig. 2d).

subset of therapeutically relevant interactions. FDR, false discovery rate.
b-d, Elastic net and ridge regression analysis. b, Analytical strategy. c,
Proportion of genomic features with consistent effect on drug response
in both studies (total number of features tested displayed above the bar
and number of cell lines indicated in parentheses). d, Ridge regression using
predictors selected by elastic net. Contrast, frequency of selection in 100
independent elastic net runs. Green and red, association with sensitivity or
resistance, respectively.

Together, these results indicate that the CCLE and GDSC pharma- 
cological data sets exhibit reasonable predictive power both separately 
and when taken as a whole. Many of the resulting drug response pre- 
dictions are well validated by prior knowledge and clinical evidence. In 
this regard, not only do the two sets of drug screening data exhibit broad 
convergence—they also provide examples of consilience: a phenomenon 
in which independent lines of experimental evidence, each with their 
own inherent limitations, arrive at fundamental scientific agreement. 


Discussion 
In summary, when analytical and biological considerations are incorpo- 
rated that reflect the nature of oncogenic dependency, pharmacological 
data from the CCLE and GDSC studies exhibit reasonable consist- 
ency. Based on positive Pearson correlations (R > 0.5), we observed 
agreement across the CCLE and GDSC data sets for the majority (67%) 
of evaluable compounds (two drugs with clear positive regression slopes 
showed R values just under 0.5 for the IC50 values; Extended Data
Fig. 1). We acknowledge that the consistency is not perfect: numerous 
biological and methodological components (for example, numbers of 
cell lines seeded per well, drug concentration range examined, number 
of cell doublings achieved, cell viability assays, analytical tools to cal- 
culate sensitivity values, and so on) undoubtedly reduced the statistical 
correlation of the overlapping pharmacological data. Further standardi- 
zation of such methodologies will certainly improve correlation metrics, 
and we welcome efforts in this direction. Nonetheless, both the CCLE 
and GDSC groups used standard methods for testing drug responses 
in cell lines, and this analysis confirmed that the consistency of their 
results seems reasonable in light of the aforementioned methodological 
differences. 

The identification of molecular predictors of drug response remains a 
major challenge for cancer precision medicine. Accordingly, large-scale 
screening of clinically relevant compounds across molecularly annotated
cancer cell line collections is likely to remain a crucial preclinical
source for hypothesis generation. The CCLE8 and GDSC9 data sets,
the two biggest public collections of genomic and pharmacological cell 
line data, have produced largely concordant results thus far, although 
rigorous comparisons should continue to be performed as these data 
sets evolve. Although neither data set is perfect on its own, they have 
both shown clear utility for predictive modelling studies and, in sev- 
eral cases, convergence onto known biological principles. Principled 
analytical frameworks (together with improved standardization) may 
conceivably illuminate additional areas of consilience through compar- 
ative studies of other functional screens (for example, RNA interference, 
CRISPR genome editing, phospho-proteomics, etc.) in the future. In all 
such instances, knowledge of the underlying biology should guide the 
implementation of those analytical and statistical methods best suited for 
comparative studies and, more generally, the extraction of meaning from 
large-scale screening data in cancer and other disease models. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS


Power analysis. To estimate and compare the statistical power of Spearman and
Pearson correlation tests, we ran the following simulation using synthetically
generated drug data: starting with 1,000 cell lines with α percent of them drug-sensitive
(α = 2%, 5%, 10%, and 50%), we randomly selected a subset of N samples
(N between 3 and 500) and calculated the Spearman and Pearson correlations
between the two data sets over the N overlapping samples. We also calculated
the statistical power for each test, by calculating the percentage of the time the
corresponding P < 0.05. In this analysis, we assumed that in both data sets, the
drug response data has a Gaussian distribution with N(0, σ²) for insensitive and
N(4σ, σ²) for sensitive cell lines.
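
The simulation described above can be sketched as follows. This is a minimal illustration rather than the R code used for the paper: the function names, the default σ = 1, the number of trials and the use of the Fisher z approximation to obtain P values are all assumptions made here for brevity.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1 (ties are not averaged; continuous data assumed)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def approx_p_value(r, n, spearman=False):
    """Two-sided P value from the Fisher z approximation (assumption of this sketch)."""
    r = max(min(r, 0.999999), -0.999999)
    scale = math.sqrt((n - 3) / 1.06) if spearman else math.sqrt(n - 3)
    z = abs(math.atanh(r)) * scale
    return 2.0 * (1.0 - NormalDist().cdf(z))


def simulate_power(alpha, n_overlap, trials=200, n_lines=1000, sigma=1.0, seed=0):
    """Return (Pearson power, Spearman power) at a P < 0.05 cutoff."""
    rng = random.Random(seed)
    n_sensitive = round(alpha * n_lines)
    sensitive = [True] * n_sensitive + [False] * (n_lines - n_sensitive)
    hits_pearson = hits_spearman = 0
    for _ in range(trials):
        subset = rng.sample(sensitive, n_overlap)
        # Each data set measures the same cell lines with independent noise:
        # N(0, sigma^2) for insensitive lines, N(4*sigma, sigma^2) for sensitive ones.
        a = [rng.gauss(4 * sigma if s else 0.0, sigma) for s in subset]
        b = [rng.gauss(4 * sigma if s else 0.0, sigma) for s in subset]
        hits_pearson += approx_p_value(pearson(a, b), n_overlap) < 0.05
        rho = pearson(ranks(a), ranks(b))  # Spearman = Pearson on ranks
        hits_spearman += approx_p_value(rho, n_overlap, spearman=True) < 0.05
    return hits_pearson / trials, hits_spearman / trials
```

For example, `simulate_power(0.02, 200)` estimates both powers for a drug to which 2% of cell lines respond, with 200 overlapping lines.
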

ANOVA. The ANOVA was performed on the data set corresponding to the over- 
lapping set of cell lines and using the genomic and tissue annotations and methods 
described in Garnett et al.9.

Specifically, a vector of length n consisting of AUC (respectively IC50) scores
for n cell lines was constructed for each drug. A linear (no interaction terms)
ANOVA model was then fitted to these scores with factors including the cell line
tissue type and the mutation status of 71 cancer genes, in turn. Significance and
effect size (computed using Cohen's d) were obtained for each of the gene-drug
pairs. This effect size measures the relative difference in the average AUC
(respectively IC50) from the wild-type to mutant group compared to the AUC
(respectively IC50) pooled standard deviation of the two groups. P values were
subsequently corrected for multiple hypothesis testing with the Benjamini-Hochberg
method and a threshold of 20% FDR was used to identify significant associations.
Subsequently, systematic unpaired Welch’s t-tests were performed to identify
tissue/drug-response associations for the drugs showing response differences
across different tissues, according to the ANOVA models.
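
As a rough illustration of the two quantities named above, the sketch below computes Cohen's d with a pooled standard deviation and Benjamini-Hochberg adjusted P values. The function names and the sign convention (mutant minus wild type) are assumptions of this example, not details taken from the authors' R package.

```python
import math
from statistics import mean, variance


def cohens_d(wild_type, mutant):
    """Difference in group means divided by the pooled standard deviation.

    Sign convention (assumed here): mutant mean minus wild-type mean.
    """
    n1, n2 = len(wild_type), len(mutant)
    pooled_var = ((n1 - 1) * variance(wild_type)
                  + (n2 - 1) * variance(mutant)) / (n1 + n2 - 2)
    return (mean(mutant) - mean(wild_type)) / math.sqrt(pooled_var)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=p_values.__getitem__)
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q
```

With these helpers, an association would be called significant at the 20% FDR threshold when its q-value is at most 0.20.
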

Elastic net (EN). Since the IC50 is not reported in CCLE when it exceeds the tested
range of 8 µM, we used the activity area for the regression as in the original CCLE
publication. We also used the values considered to be the best in the original GDSC
study: the interpolated log(IC50) values. This setting might not be the most comparable
to the CCLE study, but it was felt to be more powerful from the standpoint
of detecting bona fide associations. In order to compare features between the two 
studies, we used the same genomic data set (CCLE). 

For the GDSC regression we matched the CCLE genomic features cell line 
names to the GDSC cell line names. From the CCLE genomic features we used 
18,900 gene expressions, 1,643 genes probed for mutation (excluding 5’ UTR, 
introns, 3’ UTR, silent mutations), copy numbers for 446 genes (the Cosmic can- 
cer census genes) and 24 tissues. Elastic net regression was performed as described 
in Garnett et al.9, using 100 independent runs. For each iteration, the data are
cross-validated with a random tenfold partition of the samples. Elastic net per- 
forms feature selection: unselected features have zero coefficients. Frequency is the 
proportion of models out of the 100 runs where the coefficient is non-zero. Note 
that when we restricted the drug responses to overlapping cell lines/drugs between 
the two studies, only 12/15 drugs had any features, both in the CCLE regression 
and the GDSC regression. Contingency tables of sensitive and resistant features 
in both studies are in Extended Data Fig. 6b. To assess significance of the overlap 
between the results, χ² statistics were computed on the two-by-two contingency
tables of sensitive and resistant features (Extended Data Fig. 6a). 
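
The selection frequency and the χ² statistic on a two-by-two table can be sketched as below. The elastic net fit itself is not reproduced; each run is assumed to be available as a dictionary mapping feature names to coefficients. The function names, that input layout and the absence of a continuity correction are assumptions of this example.

```python
import math


def selection_frequency(coefficient_runs):
    """Proportion of runs in which each feature has a non-zero coefficient.

    `coefficient_runs` is a list of {feature: coefficient} dicts, one per run;
    features absent from a run are treated as unselected (coefficient zero).
    """
    n_runs = len(coefficient_runs)
    features = set().union(*coefficient_runs)
    return {
        f: sum(1 for run in coefficient_runs if run.get(f, 0.0) != 0.0) / n_runs
        for f in features
    }


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value for the table [[a, b], [c, d]].

    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Here the table rows could hold features associated with sensitivity or resistance in one study and the columns the same split in the other, so that a large statistic indicates concordant signs.
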

Ridge regression. After performing elastic net regression as described above,
the genomic features identified were applied to a ridge regression using the
same responses as in the elastic net, that is, the IC50 values for GDSC (including
extrapolated values) and activity area for CCLE. Ridge regression was performed
using all features selected by elastic net (Fig. 2b-d and Extended Data Fig. 7). In
all plots the axes represent the weights attributed in the ridge regressions that
were multiplied by the standard deviation of the features as in Garnett et al.9,
and then standardized per drug. The data points are shaded according to their 
elastic net frequency. 

Several measures of consilience of drug-feature associations between the two
studies were then computed (Extended Data Fig. 8). In order to gain statistical
power, we also performed elastic net regression using all cell lines available in each 
study separately, and compared the results (Extended Data Fig. 8, second row). 
Consilience was computed using agreement proportion (that is, the proportion 
of drug-feature associations with the same sign between the two studies), cosine 
correlation and Spearman correlation. Cosine correlation conserves the sign of 
the compared values before scalar product and is therefore a better measure of 
consilience when only a few features are available for comparison (see for instance 
nutlin-3 in CCLE panels, Extended Data Fig. 8). 
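
Two of the consilience measures named above can be illustrated with a short sketch. It assumes the ridge weights for one drug are given as two equal-length lists, one per study, in the same feature order; the function names are hypothetical.

```python
import math


def agreement_proportion(weights_a, weights_b):
    """Proportion of features whose weights have the same sign in both studies."""
    same = sum(1 for a, b in zip(weights_a, weights_b) if a * b > 0)
    return same / len(weights_a)


def cosine_correlation(weights_a, weights_b):
    """Uncentred (cosine) correlation: the signs of the raw weights are preserved."""
    dot = sum(a * b for a, b in zip(weights_a, weights_b))
    norm_a = math.sqrt(sum(a * a for a in weights_a))
    norm_b = math.sqrt(sum(b * b for b in weights_b))
    return dot / (norm_a * norm_b)
```

With only two features, a centred correlation is always +1 or −1 whatever the signs of the weights, whereas `cosine_correlation([0.5, 1.0], [0.4, 1.2])` is close to 1 only because both studies assign weights of the same sign and similar relative size.
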

Waterfall method for categorization of cell lines. Our implementation of the
waterfall method follows the steps introduced in Barretina et al.8 and described by
Haibe-Kains et al.7, with one exception: we only distinguished between sensitive
and resistant cell lines. Hence, we removed the intermediate group altogether for 
simplicity, which allowed for a more straightforward comparison of the two studies. 
The specific approach we used is outlined as follows. 

The drug sensitivity measurements were extracted (IC50 values for all cell
lines measured in the GDSC and CCLE studies). This is a major difference to
the Haibe-Kains et al. analysis7, as that analysis only considered the cell lines
in common between the studies when generating response distribution curves.
The log10(IC50) values were then sorted in increasing order to generate a waterfall distribution.
If the waterfall distribution is nonlinear (Pearson correlation coefficient to the
linear fit <0.95), the inflection point of the log10(IC50) curve was estimated as
the point on the curve with the maximal distance to a line drawn between the
start and end points of the distribution. If the waterfall distribution appears linear
(Pearson correlation coefficient >0.95), the median log10(IC50) was used instead.
Cell lines with log10(IC50) below this inflection point were classified as sensitive,
whereas the rest were deemed resistant (see the estimated inflection points in
Supplementary Data 2).
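
The waterfall steps above can be sketched as follows. This is an illustrative reading of the description, not the authors' implementation: the function names are hypothetical, the correlation to the linear fit is taken as the Pearson correlation between the sorted values and their rank index, and the point of maximal distance to the chord is found through the vertical deviation, which selects the same point as the perpendicular distance.

```python
import math
from statistics import median


def waterfall_cutoff(log_ic50, linear_r=0.95):
    """Return the log10(IC50) cutoff separating sensitive from resistant lines.

    Needs at least two distinct values (otherwise the correlation is undefined).
    """
    y = sorted(log_ic50)
    n = len(y)
    x = list(range(n))
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    if r > linear_r:
        # Distribution looks linear: fall back to the median.
        return median(y)
    # Chord between the first and last points of the sorted curve.
    slope = (y[-1] - y[0]) / (n - 1)
    deviations = [abs(y[i] - (y[0] + slope * i)) for i in range(n)]
    return y[deviations.index(max(deviations))]


def classify(log_ic50, cutoff):
    """Label each cell line; values below the cutoff are called sensitive."""
    return ["sensitive" if v < cutoff else "resistant" for v in log_ic50]
```

For a drug with a few strongly responding lines and a plateau of resistant ones, the cutoff falls at the start of the plateau, so only the responders are labelled sensitive.
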

Using this approach, we generated drug sensitivity calls for all cell lines within
the GDSC and CCLE studies and employed the Cohen’s Kappa statistical analysis
of agreement.
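
Cohen's kappa for two sets of sensitivity calls can be computed as in this minimal sketch, which assumes the calls for the two studies are given as equal-length lists of labels in the same cell line order (the function name is hypothetical).

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa: agreement between two raters corrected for chance.

    Undefined when chance agreement equals 1 (both studies use a single label).
    """
    n = len(calls_a)
    labels = set(calls_a) | set(calls_b)
    observed = sum(1 for a, b in zip(calls_a, calls_b) if a == b) / n
    # Chance agreement from the marginal label frequencies of each study.
    expected = sum((calls_a.count(label) / n) * (calls_b.count(label) / n)
                   for label in labels)
    return (observed - expected) / (1.0 - expected)
```

A value of 1 indicates perfect agreement between the two studies, and 0 indicates agreement no better than expected by chance.
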

Drug sensitivity analysis scenarios. Using the waterfall method, we estimated
inflection points (which ultimately differentiate between sensitive and resistant
cell lines) on all available/measured cell lines (for a given drug), in order to have
a more complete drug response curve and, as a result, better sensitivity agreements
(blue bars in Extended Data Fig. 3).

Without the waterfall method, we used a fixed threshold of 1 µM for each drug,
in order to distinguish between sensitive and resistant cell lines. This was much
simpler and faster than the previous approach, while generating similar results 
(green bars in Extended Data Fig. 3). 

Code availability. Most of the analyses performed in this paper were implemented
in R. The R package containing the source code can be accessed at
http://www.broadinstitute.org/ccle/Rpackage.

Extended Data Figure 1 | Comparison of pharmacological data from the
CCLE and GDSC studies. a, b, Scatter plots (blue dots) represent the drug
sensitivity measured as the area under the dose-response curve (a) and
IC50 (b) in overlapping cell lines between CCLE and GDSC studies. For
this analysis, IC50 values for insensitive compounds were set to the highest
concentration tested in both data sets. The number of overlapping cell lines
n for each drug is indicated, as well as the Pearson correlation coefficient
R and P value. In this representation, lower values denote insensitive cell
lines. The full distribution of sensitivity values for each drug and study is
depicted as ‘violin plots’ (green, CCLE; purple, GDSC) and accounts for all
tested cell lines, as opposed to the overlapping set; the grey dot represents
the median, thick black line represents the first to third quartile range, and
shape of the plot represents the kernel density of the distribution.

Extended Data Figure 2 | Power analysis of Spearman and Pearson
correlation tests. a, Example of a clear signal that appears in only 2%
(20 out of 1,000) data points using synthetic data. The Spearman statistic
completely fails to detect such a signal, which is typical for selective cancer
therapeutics. b, c, Expected Spearman and Pearson correlation coefficients
between the two data sets assuming different percentages of drug-sensitive
cell lines (α = 2%, 5%, 10% and 50%) and different numbers of overlapping
cell lines. The error bars depict ± one standard deviation. d, e, Estimated
statistical power for Spearman and Pearson correlation tests using a
P value cutoff of 0.05 for rejecting the null hypothesis. This analysis was
done using synthetic data as described in the Methods.

Extended Data Figure 4 | Overlap in ANOVA genomic correlates
of drug sensitivity. a–d, Volcano plots showing ANOVA outcomes using
drug responses from the CCLE (left, a, c) or GDSC (right, b, d) data set from the
overlapping set of cell lines, and mutational status of 71 cancer genes from
the GDSC. a, b, Analyses using AUC values. c, d, Analyses using IC50
values. Points represent drug-gene interactions (with sizes proportional
to the number of screened mutant cell lines). Positions on x axis indicate
effect size magnitudes: negative values (green circle) indicate mutations
associated with increased sensitivity, positive values (red circle)
mutations associated with increased resistance. Positions on y axis indicate
association significances (corrected P values) and the horizontal dashed
line indicates a significance threshold (FDR 20%). Corresponding drug
name, target(s) and cancer gene are reported for a subset of therapeutically
relevant interactions.

Extended Data Figure 5 | Consistency of drug sensitivity/tissue-of-origin
associations between the CCLE and GDSC data sets. Each
point is a tested association between drug response and a given cell line’s
tissue of origin. Positions of the points on the two axes correspond to
‘signed log q-values’ of the corresponding tests for the two data sets,
respectively. Point labels indicate drug names and targets (in italics) and
tested tissue (in round brackets). The sign indicates the effect of the
marker (neg = increased sensitivity and pos = increased resistance) and
the magnitude indicates the log P value of the corresponding t-test, after
correcting for multiple hypothesis testing. Fisher’s exact test P values for
independence of columns and rows of the contingency table determined
by sign and significance of the associations are also reported (over all the
tests and for significant associations only, respectively).

Extended Data Figure 6 | Comparison of genomic features selected by
elastic net between the CCLE and GDSC data sets. a, Consistency in
predictors of response identified by elastic net regression across 21,013
genome features (copy number variations, messenger RNA expression
and sequence variants). Statistical significance of the number of genomic
features identified in common (χ² test) using the GDSC and CCLE drug
sensitivity data sets. Only drugs where features were found in both studies
are represented. b, Corresponding contingency tables. Out of the 4,957
drug-gene associations with non-zero elastic net weight coefficients, only
one divergent result was found (weight coefficient with opposite signs),
corresponding to a feature with the lowest possible frequency (non-zero
coefficient in 1 out of 100 bootstrap trials in the elastic net analysis).

Extended Data Figure 7 | Comparison of genomic feature-drug
associations in the CCLE and GDSC data sets. a, b, Ridge regression
coefficients for all the drugs with successful elastic net regression in the
indicated data set are plotted using either overlapping (a) or all available
(b) cell lines. To select cell line features, elastic net was performed using
the indicated data set. Then, ridge regression was performed on each data
set using the selected features. For plotting, the weights associated with
the features were multiplied by the standard deviation of the features as in
Garnett et al.9, and then standardized per drug. Colour scale indicates the
number of times a feature is selected in 100 independent runs of the elastic
net. Green and red colouring indicate features associated with sensitivity
or resistance, respectively.

Extended Data Figure 8 | Agreement in genomic predictors of drug
response identified by elastic net regression in the GDSC and CCLE
studies. Elastic net selection of genomic features was performed on the
indicated data set and their effects were computed using a non-selective
regression (ridge). Total number of features selected by elastic net is
reported above the bars. Number of cell lines used in the regression is
in parentheses on the x axis. Consistency is reported as the proportion
of features with the overall same direction of effect (association with
sensitivity or resistance): proportion of features with same sign, using
either the cosine correlation that takes into account the sign associated
with the features or the Pearson's correlation that does not.

Extended Data Figure 9 | Gene expression correlates of drug response
identified previously have better agreement when using more stringent
FDR cut-offs. Data from Haibe-Kains et al.7. a, Scatter plots of the IC50-based
gene-drug association statistic (column “stat” in Haibe-Kains
et al.7; Supplementary Data 2 and 3 and Extended Data Fig. 6) with FDR
between 0 and 0.01 (purple), 0.01 and 0.05 (cyan), 0.05 and 0.2 (green).
In each panel the two black lines intersect at the origin and define the
agreement quadrants (top right and bottom left quadrants). b, Proportion
of genes in the agreement quadrants (same sign between the two studies).
c, Additional measures of agreement between the two studies. Agreement
measures increase with more stringent FDR cut-off, suggesting that
false discovery drives agreement down. Uncentred measures (cosine
correlation, uncentred covariance, agreement quadrant proportion) yield
better agreement between the studies (see Supplementary Discussion).

Extended Data Figure 10 | Example of significant change in observed
correlation by addition of a few sensitive cell lines. For lapatinib
sensitivity data, there are 86 overlapping cell lines between the CCLE and
GDSC data sets. a, Left panel is an excerpt from Haibe-Kains et al.7
figure 2 comparing the sensitivity data of lapatinib for the two data sets.
b, Right panel shows the two sensitive cell lines (BT-474 and NCI-H1648)
that were omitted in the analysis of Haibe-Kains et al.7. The inclusion of
these two cell lines drastically changes the observed Pearson correlation
(from 0.25 to 0.53). This is consistent with the simulation results
(Extended Data Fig. 2c) that show high variability in the observed Pearson
correlation for low sample numbers.

Thermal biases and vulnerability to
warming in the world’s marine fauna


Rick D. Stuart-Smith1, Graham J. Edgar1, Neville S. Barrett1, Stuart J. Kininmonth1,2 & Amanda E. Bates3


A critical assumption underlying projections of biodiversity change associated with global warming is that ecological 
communities comprise balanced mixes of warm-affinity and cool-affinity species which, on average, approximate local 
environmental temperatures. Nevertheless, here we find that most shallow water marine species occupy broad thermal 
distributions that are aggregated in either temperate or tropical realms. These distributional trends result in ocean- 
scale spatial thermal biases, where communities are dominated by species with warmer or cooler affinity than local 
environmental temperatures. We use community-level thermal deviations from local temperatures as a form of sensitivity 
to warming, and combine these with projected ocean warming data to predict warming-related loss of species from
present-day communities over the next century. Large changes in local species composition appear likely, and proximity
to thermal limits, as inferred from present-day species’ distributional ranges, outweighs spatial variation in warming
rates in contributing to predicted rates of local species loss.


The inherent vulnerability of ecological communities to global warm- 
ing, and therefore the magnitude of associated biodiversity change, is 
considered a function of exposure and sensitivity to warming, cou- 
pled with species’ adaptive capacity! *. Geographic models of future 
biodiversity change generally accommodate the magnitude, direction 
and distribution of temperature change**, but have limited ability to 
account for the sensitivity of communities to change. Our understand- 
ing of sensitivity to warming has been largely based on results of com- 
parative studies of species physiological tolerances and other life-history 
traits, often with extension from the laboratory to the field?-*. 
Extrapolation to whole ecological communities and large geographic
scales does, however, introduce substantial uncertainty, yet these are
the scales critical for understanding natural ecosystem functioning”’, 
on which the well-being of human society depends. 

The few studies that have considered community-level sensitivity to 
warming*”"4 have not accounted for geographic patterns in species dis- 
tributions, inherently assuming that communities comprise balanced 
mixes of relatively warm-affinity and cool-affinity species, and with 
no spatial trends or regional consistency in any deviation from this. 
Regional variation in species composition may be influenced by numer- 
ous historical, ecological and phylogenetic factors that could potentially 
result in thermal bias of communities in relation to local environmental 
temperatures, with important implications for community-level sensi- 
tivity to warming. If, for instance, most species have a warmer affinity 
than the mean local temperature, then the local community may have 
little intrinsic sensitivity to negative change with warming. In this case, 
proxies previously used for inferring sensitivity, such as habitat type 
or integrity’, may provide limited predictive insight. Quantifying the 
direction and magnitude of community thermal bias is therefore an 
important step in improving our understanding of the sensitivity of 
ecological communities to structural reorganization with warming, and 
providing a more direct means to account for sensitivity in predictions 
of vulnerability. 


Thermal biogeography 
The community temperature index (CTI) is a measure (a community-
weighted mean) of the average thermal affinity of ecological
communities, and has recently been used to quantify warming in
birds!>°, butterflies!” and fishes!®, and global commercial fisheries 
catches!°. Here we use the CTI of shallow-water marine fishes and 
invertebrates to test for thermal bias in the global distribution of 
marine communities in relation to local environmental temperatures. 

We constructed geographic and thermal distributions for 2,695 reef 
fish and 1,225 mobile macroinvertebrate species using occurrence 
records from two of the world’s most comprehensive databases for 
shallow-water marine species (Global Biodiversity Information Facility, 
http://www.gbif.org, and Reef Life Survey”*”!, http://www.reeflifesurvey.com),
combined with remotely sensed long-term mean sea surface
temperature (SST)**. We used the midpoint of the realized thermal 
distribution as a measure of the central thermal tendency for each 
species, or thermal affinity. On average, this aligns with the tempera- 
ture at which species occur at their maximum abundance in the field 
(see Methods), and is therefore a good proxy for the temperature of a 
species’ maximum ecological success. 

We then compiled the first global-scale data set of abun- 
dance-weighted CTI values from systematic quantitative sampling, 
using abundance data for all fish and invertebrate species recorded 
on standardized visual censuses at 2,447 sites by the Reef Life Survey 
(RLS) program (see Methods; Extended Data Fig. 1). This approach 
thus incorporates patterns in species dominance related to thermal 
affinity. 
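
As a minimal sketch of the index described above, the CTI of one site can be computed as a community-weighted mean of species' thermal affinities, and thermal bias as its difference from local SST. The function names and the dictionary-based inputs are assumptions of this example; the thermal affinities are taken as given (each being the midpoint of a species' realized thermal distribution).

```python
def community_temperature_index(abundance, thermal_affinity, weighted=True):
    """CTI for one site: mean thermal affinity (deg C) of the species present.

    `abundance` maps species to counts at the site; `thermal_affinity` maps
    species to their thermal midpoint. With weighted=False, every recorded
    species counts once (presence data only).
    """
    species = [s for s, count in abundance.items() if count > 0]
    weights = [abundance[s] if weighted else 1 for s in species]
    total = sum(weights)
    return sum(w * thermal_affinity[s] for w, s in zip(weights, species)) / total


def thermal_bias(cti, local_sst):
    """Positive when the community has warmer affinity than the local mean SST."""
    return cti - local_sst
```

For instance, a site at a mean SST of 21 °C with 10 individuals of a species with a 20 °C affinity and 30 individuals of a species with a 24 °C affinity has an abundance-weighted CTI of 23 °C, a presence-only CTI of 22 °C and a thermal bias of +2 °C.
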

A nonlinear global pattern is evident in CTI values, with relatively 
little change with increasing temperature in tropical and temperate 
regions, and a rapid increase in subtropical regions creating a distinct 
step (Fig. 1 and Extended Data Fig. 2a, b). This pattern is consistent 
between fishes and invertebrates (Pearson correlation = 0.98; n = 2,383; 
P<0.01) and is the same when CTI is calculated without weighting by 
abundance (that is, using presence data; Extended Data Fig. 2c, d). A 
direct result of this nonlinearity in global CTI is that the majority of 
locations are characterized by marine communities with either higher 
or lower CTI than would be expected from local SST (Extended Data 
Fig. 3). Thermal bias is ubiquitous among these communities, which 
are typically numerically dominated by species with warmer or cooler 
affinity than the local environment. 

Figure 1 | Global community temperature index values for reef fishes
and invertebrates against mean annual sea surface temperature.
a, b, Tropical and temperate communities are separated by subtropical
transitions in which communities largely comprise a mixture of temperate
and tropical species. A line with a slope of one is plotted for reference.
n = 2,175 and n = 1,901 sites for fishes and invertebrates, respectively, after
exclusion of sites with confidence scores <2.5 (see Methods).


The proximate cause of large-scale patterns of thermal bias is that 
marine species distributions do not follow the monotonic latitudinal 
and temperature gradients observed in species richness”*™*. Instead, we 
find that the majority of species studied have ranges centred in either 
temperate or tropical zones (Extended Data Fig. 4), and consequently 
show a corresponding multimodal distribution of the thermal affinities 
(that is, thermal guilds; Fig. 2). This trend is consistent when considered
for different ocean basins and biogeographic regions. In addition to
the major temperate/tropical dichotomy, the invertebrate data suggest
the presence of a third, subpolar thermal guild (Fig. 2b). 

Thermal guilds align with the theory that temperature can be con- 
sidered as an ecological resource in freshwater fishes*®, and can be 
distinguished within other independent data sets of marine species 
(see Supplementary Information). The finding of globally coherent
thermal guilds is not the result of the spatial sampling structure of the
data, such as a consequence of relatively few surveys in the subtropics;
a latitudinal transect along the well-surveyed north-south trending 
eastern Australian seaboard clearly distinguishes tropical from temper- 
ate faunas along the full cline (Extended Data Fig. 5). There are several 
potential, non-mutually exclusive mechanisms that may explain these 
findings: (1) fewer shallow-water species may have ranges centred in 
subtropical ocean climates as a result of less continental shelf area at 
subtropical latitudes globally”®; (2) historical biogeographic processes 
could be implied for the Australian fauna, through mixing of trop- 
ical Pacific/southeast Asian and temperate Australian faunas as the 
Australian continental plate drifted north, with species conserving 
thermal preferences (that is, phylogenetic inertia’’); (3) tropical cen- 
tres of speciation and subsequent colonization of temperate regions 
through ‘bridge species’ may have occurred (the ‘out of the tropics’ 
hypothesis”), and is supported by the distributions of thermal affinities 
of species in large families of fishes that span temperate and tropical 
zones (Extended Data Fig. 6); (4) there could be adaptive advantages 
associated with specialization for either warm or cool temperature 
ranges, with trade-offs in metabolic processes reducing widespread 
adaptation to intermediate temperatures. 

Regardless of the ultimate drivers, the existence of consistent thermal 
guilds and associated global-scale patterns of thermal bias has impli- 
cations for whether the net community response to warming is more 
likely to be positive or negative (in terms of abundance changes). It also 
raises the possibility that communities in some locations may be more 
vulnerable to losing species than in other locations, simply on the basis 
of the direction and magnitude of the bias in the thermal distributions 
of the species present. 


Vulnerability of marine communities to warming 

Most previous biodiversity vulnerability analyses have focused on
species, and their ability to change their geographic distribution or
adapt to avoid global extinction. Here we quantitatively assess the
vulnerability of whole communities: groups of species that are currently
recorded as co-occurring and interacting at an ecologically relevant
scale. A local ecological community is considered vulnerable if it
is likely to lose many of its constituent species. This may not translate
to reductions in overall species richness (although see below), but does
reflect a relative vulnerability to change in community structure and
ecosystem functioning, and contrasts with desirable management goals
of resilience or stability in the face of warming.

Figure 2 | Frequency distributions of fish and invertebrate species
according to their thermal distribution midpoint show modes of
temperature affinity for tropical (red), temperate (blue) and subpolar
(white) thermal guilds. a, b, Species for which confidence in thermal
midpoints was low are excluded (see Methods).

Over decadal scales, positive thermal bias of the magnitude observed 
for some locations in this study (for example, where the mean thermal 
affinity of the community is 3°C greater than local mean SST) is much 
greater than predicted ocean warming rates of <0.4°C per decade, and 
may translate to low probabilities of species loss as a result of warming, 
or relatively low community sensitivity to negative change. Most species 
in such locations are also found in other warmer locations, and so are 
unlikely to be negatively affected by warming. However, the likelihood 
of local loss of species on the basis of increasing temperature will be 
more dependent on how close each of the species is and becomes, at 
that location, to the maximum of its thermal distribution, rather than 
from the midpoint (as used to define thermal bias in our thermal bio- 
geographic analysis). To account for this, we recalculated CTI using 
the 95th percentile of species’ thermal distributions as a measure of 
contemporary realized upper thermal limits (CTImax). Realized upper
limits will be lower than fundamental limits based on physiological tol- 
erances, but arguably better reflect real-world limits, where species not 
only need to survive physiologically, but also persist in a competitive 
and predatory environment. 

For calculation of CTImax to estimate species loss with warming, we
used presence rather than abundance data and combined RLS survey
data for fishes and invertebrates, thereby covering the majority of
macroscopic mobile fauna (>2.5 cm) on rocky and coral reefs at sites
investigated. We recalculated thermal bias (TBiasmax) as the difference
between CTImax and mean summer temperatures (mean SST from the
8 warmest weeks annually from 2008-2014 (ref. 30)). This can be considered
a form of ‘distribution safety margin’, and shows a similar
global pattern to that shown in our thermal biogeographic analysis
(Extended Data Fig. 7), with CTImax and CTI very closely related
(Pearson correlation = 0.96; n = 2,089; P < 0.01).

CTImax also shows a stepped relationship with summer SST
(Extended Data Fig. 8), reflecting some consistencies among species’
realized upper thermal limits within tropical and temperate regions at
the global scale. For example, CTImax remains between 22 °C and 24 °C
across most sites with summer temperatures ranging from 14 °C to
24 °C, implying that the average species is living closer to its warmest
distributional margin at locations with summer temperatures of
approximately 24 °C than at locations with summer temperatures of
approximately 14 °C. TBiasmax is consequently more positive for the
latter, although sites dominated by species in the tropical thermal guild
(as identified in Figs 1 and 2) that experience summer temperatures of
approximately 24 °C (that is, on the upper line in Extended Data Fig. 8)
also have high TBiasmax and inferred low sensitivity.

Although TBiasmax can be considered a form of community-level
sensitivity, it does not account for warming rates, another important
component of vulnerability. To explicitly account for spatial patterns
in warming rates and provide quantitative vulnerability predictions for
marine communities, we further calculated the proportion of species
in the community that would exceed the upper limit of their realized
temperature distribution in 10 and 100 years from the present. These
are based on each species’ contemporary upper thermal limits, recent
summer temperatures, and the rate of warming expected at each site
(based on ensemble averages from all climate models included in the
Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment
Report (AR5) for sea surface temperature anomaly under the RCP8.5
scenario predicted for 2050-2099; http://www.esrl.noaa.gov/psd/ipcc/).

A total of 6 (out of 75) ecoregions included in the analysis were
identified in which the mean summer sea temperature is expected
to exceed the upper thermal limit of more than 50% of the recorded
species by 2025 (Fig. 3a, b). Confidence scores for CTImax values are
low for a number of sites in three of these ecoregions on the basis
of less comprehensive sampling of species thermal distributions (see
Methods and Extended Data Table 1), but are high for sites in the
Gulf of Thailand, southwestern Caribbean and Three Kings-North
Cape (New Zealand). Longer-term predictions are more extreme,
with 100% of the present-day community composition apparently
likely to exceed upper thermal limits in approximately one-third of
surveyed ecoregions by 2115 (Fig. 3c, d). These are distributed in all
ocean basins across the tropics, but also in some temperate areas such
as the Great Australian Bight.

Locations of greatest predicted species loss do not closely align to
locations of greatest warming, but instead correspond closely to the
magnitude of thermal bias (measured as TBiasmax; Fig. 3b, d; GAMM
results in Extended Data Table 2). This result is robust to the warming
data used (see Supplementary Information), and shows that sensitivity
associated with community thermal bias is an important component of
vulnerability. Our results further indicate that exposure, and variability
in warming rate predictions, may be considerably less important than
previously suggested when it comes to local loss of marine species
over the next century. Predicted species loss at locations with lower
thermal bias is considerably greater than at locations with higher thermal
bias, despite some of the world’s most rapidly warming regions
occurring within the latter. The western Mediterranean, for example,
is predicted to warm by 0.24-0.29 °C per decade (depending on
predictions used), but typical marine communities there consist of species
with contemporary upper limits well above local summer SST (mean
TBiasmax = 6.3 °C ± 1.1 s.d.).

Figure 3 | Vulnerability of marine communities to warming-related
local species loss. a-d, Proportion of fish and invertebrate species in
present-day communities likely to exceed their upper realized thermal
limit by 2025 (a) and 2115 (c) based on regional IPCC warming rates
(RCP8.5 scenario), and in relation to the magnitude of community
thermal bias (measured as TBiasmax; b, d). Fitted curves (solid black line)
and 95% confidence intervals (dotted black lines) are from GAMMs
(Extended Data Table 2). Sites with confidence scores <2.5 were excluded
from most ecoregion means (see Extended Data Table 1 for sample sizes
and details of exclusions).

Our predictions do not account for local influx of warmer-affinity
species, and do not comprise the only form of community-level
vulnerability to warming. Rather, they describe effects of an additional
component of ecological vulnerability. Species influx and
warming-associated changes in species abundances will also contribute to local
ecological change and are already occurring in the most rapidly warming
areas that are well connected to rich tropical faunas, such as
southeastern Australia. An influx of warm-affinity species may replace lost
species or lead to accumulating richness in some regions, and will probably
have dramatic impacts on ecological processes. However, local species loss
through extinction or range contraction will represent the main form
of community change probable for low-latitude regions for which no
pool of warmer-affinity species exists, and so our predictions
probably cover the major changes in composition expected in
these regions.

A key assumption for our vulnerability analysis is that local extinction
becomes more probable when a site becomes warmer than the
typical maximum temperature at which a species has previously been
observed. This assumption relies on the interactive mechanisms which
presently set boundaries on species’ ranges remaining consistent, such
as thermally driven performance reduction and increased susceptibility
to competition and predation. This is unlikely to be true
for all species, especially narrow-range endemics which are probably
limited in distribution by factors other than temperature. Regardless,
we consider this generalization reasonable given the well-connected
nature of the marine environment, typically with large geographic
ranges, and often closely matching fundamental (assessed in laboratory
experiments) and realized (field-derived from distribution data)
thermal niches, as well as implications associated with lower concentrations
of dissolved oxygen in the marine environment with increasing
temperature.

Our vulnerability predictions also do not account for ecological 
change resulting from extreme events, which will change biodiversity 
in spatially variable and largely unpredictable ways. This is particu- 
larly true for indirect effects of extreme events, such as through habitat 
change, which place critical pressures on biodiversity, and represent
an important direction for future research. 

Additional caveats associated with assessing vulnerability in terms 
of local loss of species from present-day communities include: (1) the 
upper thermal limits for many tropical marine species could exceed 
contemporary ocean temperature maxima, and (2) adjustment and 
thermal adaptation could reduce species loss from that predicted. The 
former does not affect results for temperate regions, but could lead to 
lower vulnerability than predicted for tropical regions, despite results 
of laboratory experiments that have applied greater temperatures than 
contemporary SST, suggesting that maximum thermal tolerance levels 
are more constrained for tropical than temperate species. Because
of these caveats, we emphasize that absolute values presented in 
Fig. 3 should be considered as a ‘worst case scenario’ and interpreted 
with caution. Nevertheless, relative differences in the magnitude of pre- 
dicted change between regions and times should be robust, other than 
perhaps overestimation of site-scale species loss at the lowest latitudes 
relative to cooler climes. Most importantly, the strength of empirical 
trends indicates that thermal bias is a fundamental element affecting 
global variability in future biodiversity change. 


Tracking and managing warming impacts on biodiversity

In contrast to prior global studies of potential biodiversity losses associ- 
ated with climate change, which typically consider loss of species from 
their full distribution or use regional species lists inferred from range 
maps, our study focused on probabilities of local-scale losses from 
assemblages of interacting species. These will be much more perva- 
sive than cases of global extinction, and have important consequences 
with respect to the way ecosystems currently function. We identify a 
substantial pressure of warming through the future, with an alarmingly 
large proportion of species predicted to exceed current realized thermal 
limits based on current distribution patterns. 

Our results imply that locations at which the average summer SST
is presently approximately 24 °C are most vulnerable to community
change in general. This temperature corresponds to the upper realized
thermal limit of many temperate species, and consequently a ceiling
on CTImax for most temperate communities. For locations with connections
to tropical faunas, it is also where the influx from the large
pool of tropical species is going to be greatest. By contrast, the warmest
tropical locations are likely to suffer from local loss of species with little
replacement, a result consistent with other studies relating biodiversity
change to global variation in predicted ocean climate velocity.

Management options for decreasing local marine species losses
resulting from warming are limited; nevertheless, reducing the effects
of other threats such as pollution, invasive species and excessive extraction
of living resources will probably provide the best opportunities
for prolonging persistence of species at the warm end of their range.
Although some local losses of species appear inevitable, management
can bolster community resilience to ocean warming through strategies
to reduce influx of warm-affinity species in those regions where
accumulation is predicted. Actions to support more intact, naturally
functioning communities are recommended, including implementation
of marine protected areas (MPAs) and more conservative fisheries
management. Recent evidence from an effective temperate MPA suggests
that local predators hinder poleward progression of warm-affinity
species, and invasion theory more generally predicts that intact and
diverse natural communities possess greater resistance to invasive
species than degraded communities.

Abundance-weighted CTI, as used in our thermal biogeographic 
analysis, offers an important tool for measuring the success of such 
management actions, as it integrates signals from local species gains 
and losses, and also abundance shifts related to temperature. The 
CTI provides a powerful metric for tracking long-term biodiversity 
change in relation to warming over larger scales, and for informing
the wider public of the magnitude of warming impacts on biodiversity. 
It can thus fill a critical gap in the indicator suite used for assessing 
progress towards international targets agreed under the Convention 
on Biological Diversity (CBD). However, we must consider for such 
application that the magnitude of CTI change will be nonlinear across 
latitude, with reduced scope for change in tropical regions. The CTI 
offers an important opportunity to extend emphasis from charts 
or maps of pressures, such as atmospheric CO₂ concentrations and
ocean heat content, towards measures of biodiversity change, thereby
providing a better understanding of real-life consequences of ocean 
warming for effective long-term change in policy and human behaviour. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Reef fish and invertebrate data. Standardised quantitative censuses of reef fishes 
and echinoderms (holothurians, echinoids, asteroids, crinoids), molluscs (gastro- 
pods, cephalopods), and crustaceans (decapods) were undertaken by trained rec- 
reational SCUBA divers along 7,040 transects at 2,447 sites worldwide through the 
Reef Life Survey (RLS) program. Full details of fish census methods are provided 
in refs 20, 21, and an online methods manual (http://www.reeflifesurvey.com) 
describes all data collection methods, including for invertebrates. Data quality and 
training of divers are detailed in ref. 20 and supplementary material in ref. 24. Data 
used in this study are densities of all species recorded per 500 m² transect area for
fishes (2 × 250 m² blocks), and per 100 m² for invertebrates (2 × 50 m² blocks). Four
per cent of all records were not identified to species level (mostly invertebrates) 
and were omitted from analyses for this study. 

Data from fish and invertebrate surveys were analysed separately for thermal
biogeography analyses, but combined for the vulnerability predictions shown in
Fig. 3. Although collected on the same transect lines, these survey components
cover different areal extents, and so were combined to represent densities per
50 m² (block size for invertebrate surveys). Raw invertebrate data were therefore
used, but one in five individual fishes were randomly subsampled from those
surveyed in each 250 m² block to provide equivalent densities and richness of
fishes per 50 m².
Characterization of species’ thermal distributions. A realized thermal dis- 
tribution was constructed for all species recorded on RLS transects, based on 
occurrences rather than species distribution models. All individual records within 
the RLS database were combined with all records of these species in the Global 
Biodiversity Information Facility (GBIF: http://www.gbif.org/), after applying 
filters to limit records to depths shallower than 26 m and time of collection since 
2004. This resulted in a data set of 399,927 geo-referenced occurrences of 3,920 
species. 

Remotely sensed local SST data were then matched to each occurrence location. 
Long-term mean annual SST values from 2002-2009 from the Bio-ORACLE data
set were used to provide a time-integrated picture of the temperatures with which
species were typically associated for the thermal biogeographic analysis. The fifth and 95th
percentiles of the temperature distribution occupied by each species were then 
calculated, and the midpoint between these used as a measure of central tendency 
of their realized thermal distribution. Midpoints were considered a reasonable 
proxy for the temperature associated with species’ maximum ecological success, 
confirmed by a close alignment of midpoints with the temperatures at which spe- 
cies occurred in maximum abundance in the global RLS data set (slope of midpoint 
versus temperature of sites at which species were at maximum abundance = 1.003, 
Pearson correlation = 0.93, P< 0.001). Thus, although interspecific variation is 
expected, deviation in temperatures either side of the midpoint results in reduced 
abundance for the average species. 

We also calculated and explored other metrics from the thermal range,
including the median and mode, but these were more sensitive to the distribution and
intensity of sampling effort across the temperature range of species, and therefore 
less robust than the midpoints. Fifth and 95th percentiles were deliberately chosen 
as endpoints rather than the maximum and minimum because marine species 
range boundaries are not static, with dynamic tails in distributions. Sightings of 
individual vagrants are common, sometimes at large distances from the nearest 
viable populations. Furthermore, any misidentification errors would have greatest 
influence if at the edge of species ranges. 
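
As an illustration only, the percentile-based midpoint described above could be sketched in Python as follows. The function names and the example temperatures are hypothetical, and the linear-interpolation percentile rule is an assumption, because the percentile method is not specified in the text.

```python
from math import floor


def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list.

    Assumption: linear interpolation between order statistics; the paper
    does not state which percentile rule was used.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one temperature value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def thermal_midpoint(sst_at_occurrences):
    """Return (midpoint, thermal range) from the 5th and 95th percentiles
    of the SST values matched to a species' occurrence records."""
    p5 = percentile(sst_at_occurrences, 5)
    p95 = percentile(sst_at_occurrences, 95)
    return (p5 + p95) / 2.0, p95 - p5


# Hypothetical mean annual SST (°C) at five occurrence sites of one species.
sst = [18.0, 19.0, 20.0, 21.0, 22.0]
midpoint, thermal_range = thermal_midpoint(sst)
```

Because the endpoints are percentiles rather than the minimum and maximum, a single vagrant record at an unusual temperature shifts the midpoint far less than it would shift the midpoint of the full range.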
Community temperature index calculation and thermal bias. CTI was calcu- 
lated separately for fishes and invertebrates for each transect in the RLS database 
as the average of thermal midpoint values for each species recorded, weighted 
by their log(x + 1) abundance. Multiple transects were usually surveyed at each 
site (2.8 transects global mean across sites used in this study). CTI values were 
averaged across these to create a site-level mean that was used for analyses. In 
some cases this averaged out seasonal effects, where sites were surveyed across 
multiple seasons. 

Thermal bias was calculated as the difference between the CTI and mean annual
SST at each site. Mean thermal bias values across sites surveyed in each ecoregion
are shown in Extended Data Fig. 3, with sample sizes for ecoregions shown in 
Extended Data Table 1. 
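
A minimal sketch of the CTI and thermal bias calculations described above, assuming each transect is available as parallel lists of species midpoints and abundances. The function names and example values are hypothetical. The natural logarithm is used here, but the choice of base does not affect a weighted mean because it cancels between numerator and denominator.

```python
from math import log


def community_temperature_index(midpoints, abundances):
    """Mean of species' thermal midpoints weighted by log(x + 1) abundance."""
    weights = [log(a + 1) for a in abundances]
    total = sum(weights)
    if total == 0:
        raise ValueError("transect has no individuals")
    return sum(m * w for m, w in zip(midpoints, weights)) / total


def site_thermal_bias(transect_ctis, mean_annual_sst):
    """Site-level CTI (mean across transects) minus local mean annual SST."""
    site_cti = sum(transect_ctis) / len(transect_ctis)
    return site_cti - mean_annual_sst


# Hypothetical transect: a temperate species (midpoint 20 °C, 9 individuals)
# and a tropical species (midpoint 26 °C, 99 individuals).
cti = community_temperature_index([20.0, 26.0], [9, 99])

# Hypothetical site with two transects and a mean annual SST of 22 °C.
bias = site_thermal_bias([24.0, 25.0], 22.0)
```

In this example the log(x + 1) weights are in the ratio 1:2 although the raw abundances differ elevenfold, so the CTI is 24 °C rather than being dominated by the more abundant species. A positive bias indicates a community made up mostly of individuals with warmer affinity than the local mean temperature.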
Confidence scores. The number of occurrence records for each species ranged 
from a single record (numerous species) to 1,009 (the Indo-Pacific cleaner wrasse, 
Labroides dimidiatus), with an overall mean of 36 records (47 for fishes, 16 for 
invertebrates). In order to consider how variation in the comprehensiveness of 
data on the thermal distribution for each species affected the calculation of CTI 
and provide an objective measure of confidence in site-level CTI values, we used 
a semiquantitative confidence scoring system. A confidence value ranging from 
one (very little confidence) to three (high confidence) was allocated to each species 
through a four-step process: 
(1) The number of records (sites) for each species was used as a first pass for
classification, with species observed at 30 or more sites given a value of three, 10-29
sites a value of two, and fewer than 10 sites a value of one.

(2) The thermal range for each species (the difference between 95th and fifth 
percentiles) was used in a second pass for all species that were initially given a value 
of two. For this, those species with a thermal range of less than 3 °C were reduced 
to a value of one, as it is possible these species have not been surveyed across their 
full potential thermal range. 

(3) Species with a value of three and a thermal range of less than 1°C were 
reduced to a two, given these likely represent well-sampled, but range-restricted 
species, and their potential thermal range is likely greater than their realized range 
(which is probably limited by other factors such as dispersal or historical bio- 
geography). 

(4) The frequency of occurrences across temperatures was also plotted sepa- 
rately for each species. Frequency histograms were visually inspected as a last pass, 
and confidence scores reduced by one if the thermal distribution appeared to be 
unduly influenced by widely separated records. 

We then recalculated CTI using the confidence score of each species in place of its
thermal midpoint, again weighted by log(x + 1) abundance, creating a CTI confidence score
for each transect and each site. A mean site confidence score of >2.5 was used as 
a cut-off for many analyses and figures, as indicated in figure captions. Although 
a score of 2.5 can be achieved in many ways, this effectively represents at least 75% 
of the individuals present belonging to species with the maximum confidence 
score of three. 
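
The scoring rules above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the visual inspection in step (4) is represented by a manually supplied flag, and the floor of one after that reduction is assumed from the stated one-to-three range.

```python
from math import log


def species_confidence(n_sites, thermal_range, flagged_on_inspection=False):
    """Confidence score (1 to 3) for one species' thermal distribution."""
    # Step 1: first pass based on the number of sites with records.
    if n_sites >= 30:
        score = 3
    elif n_sites >= 10:
        score = 2
    else:
        score = 1
    # Step 2: species initially scored two with a narrow range (< 3 °C).
    if score == 2 and thermal_range < 3.0:
        score = 1
    # Step 3: well-sampled but range-restricted species (< 1 °C).
    elif score == 3 and thermal_range < 1.0:
        score = 2
    # Step 4: manual histogram inspection, passed in as a flag (assumption:
    # the score is never reduced below one).
    if flagged_on_inspection:
        score = max(1, score - 1)
    return score


def transect_confidence(scores, abundances):
    """Mean species confidence weighted by log(x + 1) abundance."""
    weights = [log(a + 1) for a in abundances]
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# Hypothetical transect with one high- and one low-confidence species.
example = transect_confidence([3, 1], [9, 9])
```

With equal weights on a species scored three and a species scored one, the transect score is 2.0, which would fall below the 2.5 cut-off.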

Thermal guilds. Given few truly subtropical species were identified in this study, 
and this outcome could potentially result from bias in the distribution of sam- 
pling effort towards areas outside of subtropical locations (see Supplementary 
Information for more detail), we replicated Fig. 2 along a comprehensively 
sampled latitudinal gradient in Australia. The majority of Australian species are 
well-sampled across their geographic distributions and numerous sites have been 
surveyed in subtropical locations in Australia. We divided the RLS data from 968 
sites into 10° latitudinal bands along the east coast of Australia (and Papua New 
Guinea and Solomon Islands) from the equator to 43.7° S, and plotted histograms 
of thermal distribution midpoints of 1,105 species with a confidence of two or 
three (Extended Data Fig. 5). These clearly show very few species with midpoints
of 23-24°C, even in the band from 20° S to 30° S where the mean annual SST of 
sites was 23.97 °C. They also show the intrusion of numerous tropical species in 
temperate latitudes, particularly for fishes. 

Vulnerability predictions. Vulnerability predictions required characterization 
of the warmest temperatures experienced by species across their range. We re- 
constructed the thermal distributions for each species using the maximum of the 
weekly mean SST from all occurrence sites over the 12 weeks before the sampling 
date, obtaining the 95th percentile of these. We then calculated the difference 
between this value and the mean of summer temperatures (the mean of the warm- 
est 8 weeks was taken for each year between 2008 and 2014, with the mean of these 
used). This is analogous to a form of thermal safety margin, although in this case 
it does not mean a species cannot survive if the summer SST exceeds the 95th 
percentile, but rather that it has been recorded at very few sites in the combined 
RLS and GBIF databases at times in which the temperatures exceeded this value. 

We re-calculated this value for 10 years and 100 years from present, using rates 
of SST warming projected by coupled climate models under the CMIP5 RCP8.5 scenario,
calculated and freely provided by the NOAA Ocean Climate Change Web Portal 
(http://www.esrl.noaa.gov/psd/ipcc/ocn/). Sea surface temperature anomaly 
(difference in the mean climate in the future time period, 2050-2099, compared 
to the historical reference period, 1956-2005) was selected as the statistic repre- 
senting the average of 25 models, interpolated to a 1° latitude by 1° longitude grid 
and matched to each RLS site. Summer SST was predicted for each RLS site for 
10 and 100 year time periods using these values. Vulnerability was then estimated 
as the proportion of all species (fishes and invertebrates) recorded on each RLS 
survey that is expected to exceed the 95th percentile, based on the predicted SST 
at that site. This component of analyses did not incorporate abundance data, as 
the goal was to assess local species loss, rather than loss of individuals. Weighting 
by abundance had little influence on conclusions, however. 
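
The vulnerability estimate can be illustrated with a short sketch. The function names and example values are hypothetical. The warming rate is taken as an input in °C per decade because the conversion from the model anomaly to a site-level rate is not detailed here, and the linear extrapolation and the strict "greater than" comparison are both assumptions.

```python
def projected_summer_sst(current_summer_sst, warming_per_decade, years_ahead):
    """Linear extrapolation of mean summer SST (assumed constant rate)."""
    return current_summer_sst + warming_per_decade * years_ahead / 10.0


def vulnerability(upper_limits, current_summer_sst, warming_per_decade,
                  years_ahead):
    """Proportion of recorded species whose realized upper thermal limit
    (95th percentile) is exceeded by the projected summer SST."""
    if not upper_limits:
        raise ValueError("no species recorded on this survey")
    future = projected_summer_sst(current_summer_sst, warming_per_decade,
                                  years_ahead)
    exceeded = sum(1 for limit in upper_limits if future > limit)
    return exceeded / len(upper_limits)


# Hypothetical survey: four species' upper limits (°C), a current mean
# summer SST of 23.5 °C and a warming rate of 0.3 °C per decade.
limits = [24.0, 25.5, 27.0, 30.0]
in_10_years = vulnerability(limits, 23.5, 0.3, 10)
in_100_years = vulnerability(limits, 23.5, 0.3, 100)
```

In this example no species exceeds its limit after 10 years (projected summer SST of 23.8 °C), whereas two of the four do after 100 years (26.5 °C). Presence data only are used, so each species counts equally regardless of abundance.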

Confidence scores were also recalculated without abundance (and thus repre- 
sent the mean confidence of species present), and sites with confidence scores <2.5 
were excluded from calculation of ecoregion means for all ecoregions with three 
or more sites with confidence >2.5. Twenty-one of 81 ecoregions had fewer than 
three sites with confidence >2.5 with which to calculate means, so low confidence 
sites were included in means for these ecoregions. The effect of this is conserva- 
tive, theoretically reducing thermal bias (see Supplementary Information), but the 
rationale was that ecoregion means would be more accurate through their inclusion 
than if heavily weighted by few sites. To provide an additional cut-off for ecoregions 
in which the overall mean confidence was still low, we excluded ecoregions with
mean confidence <1.75. This resulted in the exclusion of six ecoregions (North
and East Barents Sea, Oyashio Current, Agulhas Bank, Sea of Japan/East Sea, Gulf 
of Maine/Bay of Fundy, Malvinas/Falklands). 

To explore the contributions of warming rates and thermal bias to vulner- 
ability predictions, we also recalculated CTI as the mean 95th percentiles of 
fish and invertebrate species recorded on transects (CTImax) and thermal bias
(TBiasmax) as the difference between site-level CTImax and mean summer SST.
TBiasmax can therefore be considered the sensitivity component of the vulner- 
ability predictions, based on recent mean summer SST and not accounting for 
warming rates (exposure). We applied GAMMs to assess vulnerability scores as 
a function of TBiasmax and warming rates, with ecoregion as a random factor
(Extended Data Table 2). 
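
For concreteness, the sensitivity component defined above can be written as a small sketch. The function names and example values are hypothetical, and the unweighted mean reflects the use of presence rather than abundance data.

```python
def cti_max(upper_limits):
    """Unweighted mean of the 95th-percentile temperatures of the species
    recorded (presence data only)."""
    if not upper_limits:
        raise ValueError("no species recorded")
    return sum(upper_limits) / len(upper_limits)


def tbias_max(upper_limits, mean_summer_sst):
    """CTImax minus mean summer SST: the sensitivity component, which
    does not account for warming rate (exposure)."""
    return cti_max(upper_limits) - mean_summer_sst


# Hypothetical community: four species' upper limits (°C) at a site with
# a mean summer SST of 23.5 °C.
limits = [24.0, 25.5, 27.0, 30.0]
community_cti_max = cti_max(limits)
community_tbias_max = tbias_max(limits, 23.5)
```

Here CTImax is 26.625 °C and TBiasmax is 3.125 °C, meaning that the average species at this hypothetical site is recorded elsewhere at summer temperatures about 3 °C warmer than it currently experiences locally.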


Conclusions are robust to the warming data used, with qualitatively similar 
results using historical warming data from another source instead of future predictions
(site warming rates in °C per decade taken from
http://www.coastalwarming.com/data.html), and ecoregion mean vulnerability scores
changing very little when the 99th percentile of species’ thermal distributions was
used instead of the 95th
percentile, even for 2115 predictions (Pearson correlation = 0.97, P< 0.01). 
Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 


44. Bates, A. E. et al. Distinguishing geographical range shifts from artefacts of 
detectability and sampling effort. Divers. Distrib. 21, 13-22 (2015). 
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Extended Data Figure 1 | Sites used in analyses at which fish and invertebrate communities were surveyed by the Reef Life Survey program. 
Numerous points are overlapping and hidden (n = 2,447). Ecoregion boundaries are shown in grey lines. 
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Extended Data Figure 2 | Community temperature index values 

for reef fishes and invertebrates against mean annual sea surface 
temperature. a—d, CTI calculated using abundance-weighted fish (a) and 
invertebrate (b) data, and including sites at which mean CTI confidence 
scores were less than 2.5 (n= 2,447 and 2,383 for fishes and invertebrates, 
respectively). Sites are colour-coded by ecoregion to help distinguish 
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spatial patterns, but as a result of numerous ecoregions (n = 81), many 
ecoregion colours are similar. CTI calculated using presence-only fish (c) 
and invertebrate (d) data, and excluding sites with confidence scores <2.5 
(n= 2,188 and 1,812 for fishes and invertebrates, respectively). Dotted 
lines have a slope of one, plotted for comparison with data. 
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Extended Data Figure 3 | Global distribution of reef fish and with warmer thermal affinity than mean local sea temperatures. Colours 
invertebrate community thermal bias. a, b, Community thermal bias are scaled to the mean thermal bias of sites surveyed within each ecoregion 
(°C) is the difference in abundance-weighted CTI from local long-term (see Extended Data Table 1 for sample sizes). Only ecoregions with sites 
mean annual sea surface temperature. Positive regions (warm colours) that were surveyed are included. 

encompass ecological communities with a predominance of individuals 
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Extended Data Figure 4 | Frequency distribution of fish and invertebrate species’ latitudinal range midpoints. a, b, Species for which confidence in 
thermal distribution midpoints (and therefore geographical distribution midpoints) was low are excluded (see Methods). 
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Extended Data Figure 5 | Frequency distribution of fish (left) and invertebrate (right) species’ thermal distribution midpoints in 10° latitudinal 
bands from Papua New Guinea and down eastern Australia (rows). a—j, Note y axes are on different scales and only species with confidence scores of 


two and three are included (see Methods). 
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Extended Data Figure 6 | Frequency distribution of thermal distribution midpoints of species in major fish families spanning temperate and 
tropical zones. Note y axes are on different scales and only species with confidence scores of two and three are included. 
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Extended Data Figure 7 | Global distribution of TBias,,,, of reef faunal 
communities. TBiasmax is calculated as the difference between CTI max 
(using the 95th percentiles of species’ thermal distributions and presence 
data) and mean summer SST. Colours are scaled to the mean TBiasmax 


undertaken are included. 
Extended Data Figure 8 | The CTImax (mean 95th percentile of species’ 
thermal distributions) for reef faunal communities across temperate 
(blue), tropical (red) and subtropical (grey) sites. SST data are means 
of the warmest 8 weeks of the year over the survey period (2008-2014). 
Points represent the surveyed community of fishes and invertebrates 
at each site (n = 2,091, only confidence scores >2.5). Regression lines 
are fitted to the maximum values within each ecoregion, with separate 
regressions fitted for sites categorised from Fig. 1 as temperate, tropical 
and subtropical. 
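
The two definitions given in the captions above can be sketched numerically. The example below is only an illustration: it assumes CTImax is the unweighted mean of the 95th-percentile temperatures of the species present at a site, and TBiasmax is that value minus the mean summer SST. The function names and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def cti_max(upper_limits_c):
    """Mean of the species' 95th-percentile temperatures (deg C) at one site."""
    return mean(upper_limits_c)


def tbias_max(upper_limits_c, summer_sst_c):
    """Thermal bias: community CTImax minus mean summer SST (deg C)."""
    return cti_max(upper_limits_c) - summer_sst_c


# Hypothetical site: three species present, mean summer SST of 23.0 deg C.
site_upper_limits = [24.0, 26.5, 27.5]
print(tbias_max(site_upper_limits, 23.0))
```

A positive value in this sketch means the community's upper thermal limits sit above the local summer temperature; a value near zero would indicate little thermal headroom.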
Extended Data Table 1 | Ecoregion means, sample sizes and vulnerability predictions 

ECOREGION  Group (Fig 1)  # sites (ED Fig 3a)  # sites (ED Fig 3b)  # sites (Fig 3, ED Fig 7)  TBiasmax (ED Fig 7)  Vulnerability 2025 (Fig 3b)  Vulnerability 2115 (Fig 3d) 
Adriatic Sea TE 1 (1)* (2)* (2)* 7.08 0.00 0.32 
Agulhas Bank TE (11)* (11)* 
Arnhem Coast to Gulf of Carpenteria TE 12 (0) 12 0.85 
Azores Canaries Madeira TE 6 (57) 12 (51) 0.99 
Bassian TE-SP 236 (5) 237 (4) 0.03 
Bight of Sofala/Swamp Coast TR 3 (0) 0.99 
Bismarck Sea TR 9 (1) 1.00 
Bonaparte Coast TR 15 (1) 0.58 
Cape Howe TE 183 (0) 0.03 
Celtic Seas SP 3 (6)* 0.58 
Central and Southern Great Barrier Reef TR 62 (0) 0.83 
Central Kuroshio Current ST (9)* 0.94 
Channels and Fjords of Southern Chile SP (8)* 0.21 
Chiapas-Nicaragua TR 14 (0) 1.00 
Chiloense SP (9)* 0.58 
Cocos Islands TR 23 (1) 1.00 
Cocos-Keeling/Christmas Island TR 15 (0) 0.03 
Coral Sea TR 154 (0) 0.79 
Cortezian TR (8)* 0.42 
East African Coral Coast TR 9 (0) 0.96 
Easter Island ST 2 (15)* 0.26 
Eastern Brazil TR (10)* 1.00 
Eastern Galapagos Islands ST 36 (0) 1.00 
Exmouth to Broome TR 176 (3) 176 (3) 0.05 
Fiji Islands TR 9 (0) 9 0.02 
Floridian TR 17 (0) 16 (1) 0.06 
Great Australian Bight TE 3 (0) 1.00 
Greater Antilles TR 1 (0) 1.00 
Guayaquil ST 16 (0) 1.00 
Gulf of Maine/Bay of Fundy SP 1(7)* 
Gulf of Thailand TR 7 (0) 1.00 
Hawaii TR 2(7)* 0.79 
Houtman TE 32 (0) 0.36 
Kermadec Island TE 14 (0) 0.24 
Leeuwin TE 69 (7) 0.25 
Lesser Sunda TR 11 (0) 0.04 
Levantine Sea ST 6 (0) 0.46 
Lord Howe and Norfolk Islands ST 94 (3) 83 (14) 91 (6) 0.17 
Maldives TR 12 (0) 10 (2) 12 1.00 
Malvinas/Falklands SP (5)* (5)* 
Manning-Hawkesbury ST 141 (0) 133 (8) 140 (1) 0.10 
Marquesas TR (7)* 6 (1) 3 (4) 1.00 
Marshall Islands TR 16 (0) 11 (2) 11 0.72 
Nicoya TR 93 (0) 389 (4) 89 (1) 1.00 
Ningaloo TR 30 (0) 0.05 
North and East Barents Sea SP (2)* 
North Patagonian Gulfs TE-SP (13)* 0.44 
North Sea SP 6 (0) 0.67 
Northeastern New Zealand TE 75 (28) 0.52 
Northern and Central Red Sea TR 13 (4) 0.22 
Northern California TE-SP (8)* 1.00 
Northern Galapagos Islands TR 11 (0) 1.00 
Oyashio Current SP (4)* 
Panama Bight TR 40 (0) 1.00 
Papua TR 30 (3) 1.00 
Phoenix/Tokelau/Northern Cook Islands TR 12 (0) 0.62 
Puget Trough/Georgia Basin SP (8)* 0.38 
Rapa-Pitcairn ST 5 (0) 0.77 
Samoa Islands TR 25 (0) 0.26 
Sea of Japan/East Sea TE-SP (6)* 
Seychelles TR 12 (0) 0.57 
Shark Bay ST 6 (0) 0.13 
Society Islands TR 17 (0) 0.55 
Solomon Archipelago TR 5 (0) 1.00 
South Australian Gulfs TE 71 (0) 0.72 
South Kuroshio TR 8 (0) 1.00 
South New Zealand SP (1)* 0.11 
Southern California Bight TE (14)* (14)* (14)* 0.81 0.23 0.97 
Southern Caribbean TR 14 (0) 1.00 
Southern Cook/Austral Islands TR 15 (0) 0.01 
Southwestern Caribbean TR 22 (0) 1.00 
Three Kings-North Cape TE 6 (0) 0.72 
Tonga Islands TR 31 (0) 0.07 
Torres Strait Northern Great Barrier Reef TR 26 (0) 1.00 
Tuamotus TR 53 (0) 0.97 
Tweed-Moreton ST 39 (0) 0.54 
Vanuatu TR 1(0) 0.01 
Western Bassian TE-SP 10 (0) 0.07 
Western Galapagos Islands ST 30 (1) 1.00 
Western Mediterranean TE 28 (0) 26 (2) 26 (2) 6.29 0.04 0.14 
Western Sumatra TR 30 (0) 27 (0) 30 0.61 0.13 1.00 


The number of sites used in figures is the number of sites with confidence >2.5, with number of sites with confidence <2.5 shown in brackets. An asterisk indicates that sites with confidence <2.5 were 
included in calculations of ecoregion means. Group identifies whether fauna surveyed at sites within the ecoregion can be classified as temperate (TE), tropical (TR), subtropical (ST), subpolar (SP), and 
temperate-subpolar transition (TE-SP) on the basis of CTI. 
Extended Data Table 2 | GAMM results 

                coefficient   standard error   t-value   P-value 
2025 
Intercept       0.080         0.137            0.586     0.558 
Warming rate    -0.138        0.592            -0.233    0.816 
TBiasmax                                                 <0.001* 
2115 
Intercept       0.204         0.137            1.490     0.136 
Warming rate    1.180         0.591            1.999     0.046* 
TBiasmax                                                 <0.001* 

Results for Fig. 3b and d. Proportion of species loss predicted by 2025 and 2115 as a function of warming rate and TBiasmax. N = 2,091. 
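
As a quick consistency check on the parametric rows of this table, the t-value can be recomputed as coefficient divided by standard error. The sketch below assumes that this usual ratio is what the table reports; small differences are expected because the printed coefficients and standard errors are rounded to three decimals.

```python
# (coefficient, standard error, reported t-value) as printed in the table.
ROWS = {
    ("2025", "Intercept"): (0.080, 0.137, 0.586),
    ("2025", "Warming rate"): (-0.138, 0.592, -0.233),
    ("2115", "Intercept"): (0.204, 0.137, 1.490),
    ("2115", "Warming rate"): (1.180, 0.591, 1.999),
}


def t_value(coefficient, standard_error):
    """Ratio of an estimate to its standard error."""
    return coefficient / standard_error


for (year, term), (coef, se, reported) in ROWS.items():
    recomputed = t_value(coef, se)
    print(f"{year} {term}: recomputed {recomputed:.3f}, reported {reported:.3f}")
```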
Brain tumour cells interconnect to a functional and resistant network 

Matthias Osswald, Erik Jung, Felix Sahm, Gergely Solecki, Varun Venkataramani, Jonas Blaes, Sophie Weil, 
Heinz Horstmann, Benedikt Wiestler, Mustafa Syed, Lulu Huang, Miriam Ratliff, Kianush Karimian Jazi, 
Felix T. Kurz, Torsten Schmenger, Dieter Lemke, Miriam Gommel, Martin Pauli, Yunxiang Liao, Peter Haring, 
Stefan Pusch, Verena Herl, Christian Steinhauser, Damir Krunic, Mostafa Jarahian, Hrvoje Miletic, Anna S. Berghoff, 
Oliver Griesbeck, Georgios Kalamakis, Olga Garaschuk, Matthias Preusser, Samuel Weiss, Haikun Liu, 
Sabine Heiland, Michael Platten, Peter E. Huber, Thomas Kuner, Andreas von Deimling, Wolfgang Wick & 
Frank Winkler 


Astrocytic brain tumours, including glioblastomas, are incurable neoplasms characterized by diffusely infiltrative growth. 
Here we show that many tumour cells in astrocytomas extend ultra-long membrane protrusions, and use these distinct 
tumour microtubes as routes for brain invasion, proliferation, and to interconnect over long distances. The resulting 
network allows multicellular communication through microtube-associated gap junctions. When damage to the network 
occurred, tumour microtubes were used for repair. Moreover, the microtube-connected astrocytoma cells, but not those 
remaining unconnected throughout tumour progression, were protected from cell death inflicted by radiotherapy. The 
neuronal growth-associated protein 43 was important for microtube formation and function, and drove microtube-dependent 
tumour cell invasion, proliferation, interconnection, and radioresistance. Oligodendroglial brain tumours 
were deficient in this mechanism. In summary, astrocytomas can develop functional multicellular network structures. 
Disconnection of astrocytoma cells by targeting their tumour microtubes emerges as a new principle to reduce the 
treatment resistance of this disease. 


Astrocytomas (World Health Organisation grades II, III and IV; grade 
IV are called glioblastomas) are prototypical examples for highly inva- 
sive tumours that diffusely colonize their host organ’, which ultimately 
leads to neurological dysfunction and death despite intensive radio- 
and chemotherapy. Oligodendroglioma is another glioma type that 
shares many molecular features like frequent isocitrate dehydrogenase 
(IDH1 and IDH2) mutations’, but is less invasive and far more vul- 
nerable to therapeutic intervention than astrocytomas. A codeletion 
of the chromosomal parts 1p and 19q is characteristic for oligoden- 
drogliomas, but absent in astrocytomas*~*. This codeletion allows for 
molecular subgrouping”“, and is associated with a high responsiveness 
of oligodendrogliomas to radiochemotherapy, leading to marked long- 
term survival benefits”*®. The reason for that remained unclear, just as 
the specific mechanism(s) of resistance in 1p/19q intact astrocytomas. 

Here we describe the discovery of a direct anatomical connection 
between astrocytoma cells, with relevance for tumour functionality and 
resistance. Similar cell-cell connections by membrane tubes have been 
first described in Drosophila development’. They can play a role in the 
transport of organelles and proteins®”, spread of infectious particles”, 
stem cell signalling", and functional cell-cell coupling'?"“. Studied 
mostly in vitro so far, these tubes have received many names, including 
membrane nanotubes, tunnelling nanotubes, or cytonemes. However, 
the exact function(s) of membrane tube connections in mammalian 
tissues and in tumour biology remained unresolved'». 


Membrane tubes in glioma progression 

To study the occurrence and dynamics of membrane tube protru- 
sions in mammalian tumours, we followed gliomas growing in the 
mouse brain by in vivo multiphoton laser-scanning microscopy 
(MPLSM) down to a depth of 750 µm (ref. 16), for up to one year. 
After transplantation of patient-derived glioblastoma cell lines (n = 6) 
that were kept under serum-free, stem-like conditions!” (GBMSCs; 
non-codeleted for 1p/19q, and IDH wild-type; Extended Data 
Fig. 1a-k), many tumour cells formed ultra-long cellular protrusions. 
Figure 1 | Distinct membrane microtubes of brain tumour cells. 
a, In vivo MPLSM maximum intensity projections (MIPs) of S24 
GBMSCs growing in the mouse brain over 60 days (D). Arrows, thin 
cellular protrusions extending into the normal brain; arrowheads, long 
intratumoral protrusions. b, Travel of nuclei (arrows, arrowheads) 
after nuclear division (at 23 and 103 h) in cellular protrusions of S24 
GBMSCs. 3D images. c, Protrusions are actin-rich, and interconnect 
single tumour cells (arrowheads, S24 Lifeact-YFP, MIP). d, Scanning 
electron microscopy (SEM) image of a S24 GBMSC membrane microtube 
(arrow, identified by GFP photo-oxidation) in the mouse brain. Asterisks, 
axons. e, 3D rendering (z dimension 90 µm) of membrane microtubes 
interconnecting single S24 GBMSCs. Intercellular (connecting) and 
non-connecting tubes, and connected and unconnected tumour cells are 
shown colour-coded. f, Number of membrane microtubes per S24 tumour 
cell that connect this cell to another tumour cell (n = 141-437 cells in 
n = 3 mice). g, h, Representative confocal IDH1R132H mutation-specific 
immunofluorescence images of a human astrocytoma grade II (AII, g) and 
oligodendroglioma grade III (OIII, h). i, Maximum length of IDH1R132H-
positive microtubular structures in human oligodendrogliomas (O, 1p/19q 
codeleted) grade II, III; and astrocytomas (A, 1p/19q non-codeleted) grade 
II, III and IV (glioblastoma, GBM); n = 20-24 patients per tumour entity, 
n = 105 total. a-c and e-f, in vivo MPLSM. Error bars show s.d. 


These protrusions infiltrated the normal brain at the invasive front 
(Fig. 1a), where astrocytoma cells extended and retracted them in a 
scanning mode (Supplementary Video 1). Protrusion tips were highly 
dynamic (Extended Data Fig. 1l), similar to neuronal growth cones 
during development18. When tumours progressed, the number of cel- 
lular protrusions increased further, some exceeding 500 µm in length 
(Extended Data Fig. 2a). The resulting membrane tubes were used as 
tracks for travel of cell nuclei, for example, after mitosis (Fig. 1b; speed 
of travelling nuclei: 66.42 ± 36.25 µm per day, n = 16 nuclei in n = 6 
mice). These data suggest that membrane tube formation is a novel 
means of tumour dissemination, adding to the known strategies”. 

All membrane tubes were actin-rich (Fig. 1c), which is also typical 
for most membrane nanotubes. Moreover, live imaging and immuno- 
histochemistry revealed that they were indeed enclosed by a contin- 
uous cell membrane, and positive for myosin IIa, microtubules, and 
protein disulphide isomerase; partly positive for β-catenin, β-parvin, 
and the astrocytic marker GFAP; but largely negative for N-cadherin, 
myosin X, and the neuronal marker β-tubulin III (Extended Data 
Fig. 2b, c). Together, these data indicate that these membrane tubes 
have a unique composition and a potent motility machinery. Dendritic 
arborization was frequent, with more dynamic thin membrane tubes 
originating from more stable, thicker ones (Fig. 1c, Extended Data 
Fig. 3a, b). To allow ultrastructural analysis of these thin tumour 
cell-derived tubes in the mouse brain by electron microscopy, pho- 
to-oxidation of brain sections was performed. This resulted in dark 
precipitates within the green fluorescent protein (GFP)-expressing 
astrocytoma cell tubes (Extended Data Fig. 3c). Serial-section scan- 
ning electron microscopy (3D SEM) revealed that the cell membrane- 
enclosed tubes had a mean cross-sectional area of 1.57 ± 0.33 µm² 
(n=6, Extended Data Fig. 3d), and contained mitochondria and 
microvesicles (Fig. 1d), suggesting that there is local ATP production 
and vesicle trafficking in the tubes. Interestingly, mitochondria travelled 
quickly in these tubes (Extended Data Fig. 3e). Furthermore, a relevant 
number of membrane microtubes were following axons in the brain 
(19.6% of n=51; Fig. 1d, Extended Data Fig. 3f), which are known 
leading structures for tumour cell dissemination in astrocytomas””. 

In vivo imaging of membrane tube development over time revealed 
that an increasing number started at one and ended at another astro- 
cytoma cell, creating a multicellular anatomical network (Fig. 1e, f; 
Supplementary Video 2a, b). Abundant intercellular membrane tubes 
were also found in a genetic astrocytoma model?! (Extended Data 
Fig. 3g, Supplementary Video 2c). Intercellular membrane tubes were 
in part a result of cell division, with enduring stable contact of daughter 
cells over long distances (Fig. 1b), but also of mating of non-related 
astrocytoma cells (Extended Data Fig. 4a—g). A small proportion of 
membrane tube-bearing astrocytoma cells remained quiescent for 
months, often in a perivascular niche that has been associated with 
glioma cell stemness” (Extended Data Fig. 4h, i). 

The intercellular position of many astrocytoma membrane tubes, 
together with their high content of F-actin, is reminiscent of mem- 
brane nanotubes!*; however, the membrane nanotubes reported so 
far had a width of below 1 µm; a length of usually tens, rarely a few 
hundreds of µm; and documented life time of less than 60 min. These 
differences led us to propose the new term “tumour microtubes”, or 
TMs, for the discovered ultra-long, long-lived, and thicker membrane 
extensions of astrocytoma cells. 


TMs characterize human astrocytomas 

To investigate whether TMs are also characteristic for human brain 
tumours, we stained resected WHO grade II-IV gliomas with 
IDH1R132H mutations using a mutation-specific antibody23. This 
allowed us to unambiguously detect tumour-cell-derived membrane 
tubes in the filament-rich brain parenchyma. Like in the astrocytoma 
mouse models, TMs were abundant in patient tissue (Fig. 1g): 63% of 
astrocytoma cells had intercellular TMs (n = 196 cells in 100 µm thick 
sections of n=8 WHO grade II-III tumours without 1p/19q codele- 
tion; Supplementary Video 3). In contrast, only 0.7% of oligodendro- 
glioma cells in human tumour samples had intercellular TMs (Fig. 1h; 
n=150 cells from n= 3 oligodendrogliomas with 1p/19q codeletion), 
and TMs were also rare in patient-derived oligodendroglioma cells that 
formed tumours in mice (Extended Data Fig. 5a, b). Further analysis of 
105 human gliomas revealed that TM formation was highly influenced 
by tumour type and grade, with a marked positive correlation of TM 
Figure 2 | TM-connections allow communication in multicellular 
networks. a, Time-lapse image of calcium waves (Rhod-2AM) travelling 
along TMs of GBMSCs (bidirectional; arrows). Red arrow, crossing of 
two TMs, with simultaneous calcium peak. b, Calcium transients (ΔF/F0, 
Rhod-2AM) of non-connected GBMSCs (grey) versus TM-connected cells 
(blue); broken lines mark synchronous calcium transients. c, Synchronicity 
(see Methods section) of calcium peaks in GBMSCs, shown for 
TM-connected versus non-TM-connected tumour cells (n = 40 versus 
43 cells in n = 3 mice per condition; t-tests). d, Representative heat map of 
calcium transients between GBMSCs. e, Synchronicity of calcium peaks 
during brain superfusion with extracellular saline (ES, control) versus 
100 µM carbenoxolone (CBX) in GBMSCs (blue box) and normal brain 
astrocytes (red box); n = 3 mice per group; t-tests. f, SR101 uptake in a 
non-TM-connected GBMSC (upper image) and a TM-connected one 
(lower image). Right, corresponding quantification (AU, arbitrary units); 
n = 55 cells in n = 3 mice per condition; Mann-Whitney test. All images 
and analyses, in vivo MPLSM. GBMSCs, S24 line. Error bars show s.d. 
*P < 0.05, ***P < 0.001. 
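
The calcium transients in panel b are expressed as a relative fluorescence change, ΔF/F0. As a minimal sketch of that normalisation, the example below uses the standard definition (F − F0)/F0. Taking the baseline F0 as the mean of the first few frames is an assumption made here for illustration, not a description of the authors' Methods; the trace values are hypothetical.

```python
def delta_f_over_f0(trace, n_baseline=5):
    """Return (F - F0) / F0 for each sample of a fluorescence trace.

    F0 is estimated as the mean of the first `n_baseline` samples
    (an illustrative choice; other baseline estimates are common).
    """
    if n_baseline <= 0 or n_baseline > len(trace):
        raise ValueError("n_baseline must be between 1 and len(trace)")
    f0 = sum(trace[:n_baseline]) / n_baseline
    if f0 == 0:
        raise ValueError("baseline fluorescence must be non-zero")
    return [(f - f0) / f0 for f in trace]


# Hypothetical raw fluorescence values (arbitrary units) with one transient.
raw = [100.0, 100.0, 100.0, 100.0, 100.0, 150.0, 200.0, 120.0]
print(delta_f_over_f0(raw))
```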


length and unfavourable prognosis. For example, bona fide TMs of 
a minimum of 50 µm length in standard thin sections were detect- 
able in only 19% of WHO grade II oligodendrogliomas, but 93% of 
WHO grade IV astrocytomas (= glioblastomas) (Fig. 1i). In astrocy- 
tomas, TMs were even detected in the contralateral brain hemisphere 
(Extended Data Fig. 5c), and also in IDH wild-type tumours (Extended 
Data Fig. 5d). The 1p/19q status better predicted TM occurrence than 
morphological glioma classification (Extended Data Fig. 5e-g). 


A communicating network 

Intercellular calcium waves (ICWs) can coordinate the activity of 
individual cells in multicellular networks, which includes astrocytes 
of the normal brain?*”°, neurons”, and radial glia cells during cen- 
tral nervous system development”’. We observed extensive and long- 
range ICWs, involving many astrocytoma cells in various tumour 
regions (Supplementary Video 4). ICWs were propagated along TMs 
in both directions (Fig. 2a, Extended Data Fig. 6a). Further analysis 
confirmed that ICWs, measured by synchronicity of calcium fluc- 
tuations, were largely restricted to astrocytoma cells with detectable 
TM connections (Fig. 2b, c; Extended Data Fig. 6b), allowing com- 
munication of individual cells in a reproducible pattern (Fig. 2b, d; 
Extended Data Fig. 6c, d). 
Figure 3 | Connexin 43 gap junctions connect TMs. a, Quantification 
of Cx43 protein expression detected by immunohistochemistry in 1p/19q 
non-codeleted (IDH wild-type and IDH mutated) versus codeleted human 
gliomas (n = 8 each, ANOVA, Tukey’s post-hoc test). b, Nestin and Cx43 
double-immunofluorescence in S24 and T1 GBMSC tumours (arrows: 
Cx43-positive TM-interconnections). c, SEM image of a direct membrane 
contact of 2 TMs in the mouse brain. d, Synchronicity of calcium peaks in 
S24 shControl versus shCx43 cells (n = 3 mice/condition; Mann-Whitney 
test). e, T2 MRI images of shControl versus shCx43 tumours, 72 days after 
implantation. Quantifications of n = 6 animals per group (t-test). d, In vivo 
MPLSM. Error bars show s.d. *P < 0.05; ***P < 0.001. 


For membrane nanotubes, intercellular connections have been 
reported in vitro to either be open ended®, or separated by gap junc- 
tions'4, the latter as a prerequisite for ICW propagation". Indeed, phar- 
macological gap junction blockade reduced the frequency (Extended 
Data Fig. 6e) and synchronicity (Fig. 2e) of TM-mediated ICWs in 
astrocytoma cells in vivo, but not in co-registered brain astrocytes 
of that tumour region. Inhibition of inositol triphosphate, which is 
gap-junction-permeable and the effector of gap-junction-mediated 
ICW propagation”® also reduced ICWs between astrocytoma cells 
(Extended Data Fig. 6f). 

The functional connection of single astrocytoma cells via 
TM-associated gap junctions was verified by rapid distribution of 
the gap junction-permeable dye sulforhodamine 101 (ref. 28) in the 
TM-connected cellular tumour cell network in vivo after local injec- 
tion (Fig. 2f), which was inhibited by gap junction blockade (Extended 
Data Fig. 5g). Further experiments confirmed that another gap 
junction-permeable molecule was transferred between TM-connected 
cells (Extended Data Fig. 6h-j), while gap junction-impermeable large 
molecules were not (Extended Data Fig. 4c). 


Connexin 43 connects TMs 

To identify which of the known gap-junction-forming human connexins 
is involved in TM-mediated cell-to-cell communication in astrocyto- 
mas, we hypothesized that the deficiency of 1p/19q codeleted gliomas 
for intercellular TMs might also result in lower expression of the rele- 
vant TM-associated connexin(s). Analysis of 250 glioma samples of the 
TCGA data set revealed a list of differentially expressed genes between 
1p/19q codeleted versus non-codeleted human tumours. Of the 20 con- 
nexins for which reads were mapped, only connexin 43 (also known 
as Cx43, GJA1) was differentially expressed, and found to be among 
the top 100 upregulated genes in 1p/19q non-codeleted tumours, both 
in IDH mutated and wild-type ones (Supplementary Table 1). This 
was confirmed in patient tumour tissue (Fig. 3a), and also in the pri- 
mary glioma cell lines (Extended Data Fig. 6k). Confocal microscopy 
revealed punctate Cx43 immunoreactivity particularly at the TMs of 
astrocytoma cells (Fig. 3b), which was not seen for other connexins 
(Extended Data Fig. 6l). Remarkably, Cx43 immunoreactivity fre- 
quently located at the place where two different TMs crossed each 
Figure 4 | TM-connected astrocytoma cell networks can repair 
themselves, and resist radiotherapy. a, Time series of exemplary 3D 
images (54 µm thickness) of a tumour region after laser-induced killing 
of a GBMSC (circle). Over time a TM (arrowheads) is extended, and a 
nucleus (asterisk) translocates via that TM to the place the killed cell had 
been located. b, Time course after laser-induced photodamage (dotted 
area), arrows: GBMSC TMs extending into the photodamaged region. MIP 
of 21 µm, n = 3 mice. c, GBMSC tumour microregions (3D images, 20 µm 
depth) before start of radiation (D0), and 7 days later (D7). Asterisks, 


other (Fig. 3b). Contact sites of individual TMs with direct membrane- 
membrane contact were also detected with 3D SEM (Fig. 3c). Further 
analysis of our ICW data sets revealed that calcium waves can propagate 
via those crossings from one TM to another (Fig. 2a, red arrow). 

To investigate the functional role of the Cx43 gap junction protein 
in astrocytoma progression, a stable short hairpin RNA knockdown 
of Cx43 was performed in GBMSCs. This significantly reduced the 
synchronicity of ICWs in vivo (Fig. 3d), and also the proportion of 
astrocytoma cells with multiple TMs at later time points (Extended 
Data Fig. 6m), which suggests a role for Cx43 gap-junction-mediated 
communication in long-term stabilization of TMs. In accordance 
with the proposed role of functional TMs for tumour progression, 
Cx43 deficiency resulted in reduced tumour size as observed by MRI 
(Fig. 3e) and improved survival (Extended Data Fig. 6n). 


A self-repairing and resistant network 

To investigate the role of TMs in damage repair in vivo, selective abla- 
tion of single GBMSCs was performed by applying a fatal laser dose 
to a fraction of their nuclear volume (1 µm³). If the ablated cell was a 
prior member of the TM-connected network, new TMs were extended 
towards the dead cell, and within a few days a new nucleus advanced 
via those TMs to the location of the prior cell (Fig. 4a; n = 8 reconsti- 
tution events in 8 photodamaged tumour cells from n = 3 animals). If 
a non-TM-connected GBMSC was ablated, such repair mechanism 
was only infrequently observed (2/8 events in n = 3 animals; P< 0.01, 
Fisher's exact test). Photon damage to a larger volume consisting of 
6-10 GBMSCs and normal brain parenchyma resulted in rapid exten- 
sion of TMs of neighbouring GBMSCs into this area, followed by a 
marked increase in tumour cell density specifically in the damaged 
volume (Fig. 4b). 
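
The comparison of repair events reported above (8 of 8 ablated TM-connected cells versus 2 of 8 non-TM-connected cells) can be reproduced with a Fisher's exact test on the 2 × 2 table. The sketch below is a plain-Python illustration under two assumptions: that the test is two-sided, and that it sums the probabilities of all tables no more likely than the observed one. The text does not state which variant the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability of x "successes" in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: TM-connected, non-TM-connected; columns: repaired, not repaired.
p = fisher_exact_two_sided(8, 0, 2, 6)
print(f"two-sided P = {p:.4f}")
```

Under these assumptions the P value falls below 0.01, in line with the threshold reported in the text.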

Next we investigated whether TM-connected tumour cell networks 
were also resistant against the cytotoxic effects of radiation therapy, 
a standard treatment of gliomas. While TM-connected cells were 
largely protected from cell death, unconnected tumour cells died in 
relevant numbers after radiotherapy (Fig. 4c, Extended Data Fig. 7a, b). 
Furthermore, TM-connected astrocytoma cells increased both their TM 
number (Fig. 4d) and their calcium communication (Fig. 4e, Extended 
Data Fig. 7c) as a reaction to radiotherapy. Concordantly, Cx43 knock- 
down reduced the radioprotective effect of TM interconnections, while 
non-TM-connected GBMSCs; arrow, one exemplary cell with many TMs. 
Graph, relative change of cellular subtypes under sham radiation (Sham), 
or treated with radiation at 3 × 7 Gy (Rad) (n = 3 mice per group, t-tests). 
d, Number of TMs per cell in these tumours (3D images left side, n = 50 
cells per time point in n = 3 mice per group, Mann-Whitney test). 
e, Synchronicity of calcium transients after sham treatment or 
radiotherapy in individual GBMSC tumour regions (Fluo-4AM, n = 3 
mice per group, Mann-Whitney test). All images, in vivo MPLSM. 
GBMSCs, S24 line. Error bars show s.d. *P < 0.05, ***P < 0.001. 


non-TM-connected astrocytoma cells regressed like in control tumours 
(Extended Data Fig. 7d). 

To explore potential mechanisms of TM-mediated protection 
from cytotoxicity, we measured basal intracellular calcium levels in 
astrocytoma cells before and during radiotherapy, using a ratiomet- 
ric calcium indicator. Basal calcium levels were very homogeneous in 
non-irradiated cells, and also in TM-connected cells during radiother- 
apy, while unconnected cells developed a high variability of their intra- 
cellular calcium levels during irradiation (Extended Data Fig. 7e-h). 


Drivers of TM formation 

Next we sought to identify the crucial molecular pathways that 
drive the formation of TMs to better understand their nature, and 
to substantiate their role for tumour progression and resistance. For 
this purpose, we first analysed the in silico data set of 1p/19q non- 
codeleted versus codeleted human gliomas (Supplementary Table 1) 
by using Ingenuity Pathway Analysis. Here, biological functions that 
were prominently activated in 1p/19q non-codeleted astrocytomas 
included “cellular movement” and “cell-to-cell signalling and inter- 
action”, supporting the proposed function of TMs in these tumours 
(Extended Data Fig. 8a, b for IDH mutant astrocytomas; similar 
results for IDH wild-type astrocytomas, data not shown). Intriguingly, 
we found many canonical pathways involved in the outgrowth of neu- 
rites, and neurite-like membrane protrusions to be more activated 
in 1p/19q non-codeleted gliomas, including integrin’, phospho- 
lipase C*", Rho family GTPases*!, HMGB1~, and also the proto- 
typical neurotrophin/TRK signalling pathways*° (Extended Data 
Fig. 8c; confirmed in IDH wild-type astrocytomas, data not shown). 
The latter was confirmed at the protein level in human gliomas, 
where the neurotrophins NGF (located on 1p) and NT-4 (19q) were 
downregulated in 1p/19q codeleted tumours, and also their respec- 
tive membrane receptors TrkA and TrkB, which has been described 
before* (Extended Data Fig. 9a). 

When considering these results and reviewing the literature for 
known downstream effectors particularly relevant for the formation 
of neurite-like membrane protrusions, the growth-associated pro- 
tein GAP-43 came into focus. GAP-43 is highly expressed in axonal 
growth cones***», induced by neurotrophin receptor signalling***’, 
and drives neuronal progenitor cell migration*®. Remarkably, GAP-43 
Figure 5 | GAP-43 is required for TM outgrowth and function. 
a, GAP-43 protein expression (immunohistochemistry) in 1p/19q non- 
codeleted (IDH wild-type and IDH mutated) versus codeleted gliomas 
(n = 8 each) (ANOVA on the ranks, Student-Newman-Keuls post hoc 
test). b, Immunocytological images of GAP-43 protein, with preferential 
GAP-43 localization at the tip of TMs. c, Single-plane images of control 
and shGAP-43 knockdown GBMSC S24 tumours; right, corresponding 
quantification (n = 3 mice per condition; Mann-Whitney tests). 
d, Synchronicity analysis of calcium peaks in S24 shControl versus 
shGAP-43 cells in vivo (n = 3 mice; Mann-Whitney test). e, Time 


overexpression is sufficient for the outgrowth of membrane tubes in 
neuronal* and even in non-neuronal* cells. Indeed, GAP-43 was sig- 
nificantly higher expressed in 1p/19q non-codeleted human gliomas 
(Fig. 5a) and primary stem-like cell lines (Extended Data Fig. 9b, c) 
when compared with 1p/19q codeleted ones. Of note, GAP-43 prefer- 
entially localized to the cone-like and nestin-negative tips of sprouting 
TMs (Fig. 5b), similar to its known enrichment in the nerve growth 
cone*’. 

To interfere with TM formation during astrocytoma progression, 
we engineered GBMSCs with a genetic knockdown of GAP-43. While 
in vitro viability of these cells was not affected (data not shown), their 
TMs in vivo were structurally abnormal, with reduced branching 
(Extended Data Fig. 9d). This was associated with an impaired dis- 
semination of tumour cells (Fig. 5c, Extended Data Fig. 9e), result- 
ing from both their decreased mean invasion speed (Extended Data 
Fig. 9f) and proliferation capacity (Extended Data Fig. 9g) in the mouse 
brain. Importantly, intercellular TM connections (Extended Data 
Fig. 9h) and ICWs (Fig. 5d) were reduced in GAP-43 deficient tumours, 
which was accompanied by a selective reduction of Cx43 gap junc- 
tion protein expression (Extended Data Fig. 9i). These deficiencies in 
TM-mediated features of tumour progression led to a marked reduc- 
tion of tumour size in the mouse brain (Extended Data Fig. 9j), and to 
an improved survival of the animals (Extended Data Fig. 9k). When 
radiotherapy was applied, GAP-43 deficiency resulted in an increased 
regression of tumour cells, as revealed by repetitive in vivo MPLSM 
(Fig. 5e). This was confirmed by MRIs 60 days after radiation, where 
no relevant tumour-derived signal changes were detectable in shGAP-43 
tumour bearing animals, while control tumours were large, causing 
neurological symptoms in mice (Fig. 5f). Histological analysis at this 
time point confirmed only small remnants of proliferation-deficient 
tumour cells in GAP-43 knockdown tumours (Extended Data Fig. 9l). 

Finally, we overexpressed the GAP-43 protein in 1p/19q codeleted 
primary oligodendroglioma cells, to achieve protein levels compa- 
rable to the 1p/19q non-codeleted GBMSC lines (Extended Data 
Fig. 9m). This led to a morphological shift to a TM-rich, thus astro- 
cytoma-like phenotype of GAP-43 overexpressing oligodendroglioma 
cells (Fig. 5g, Extended Data Fig. 9n, o). Remarkably, the induction 
of TM formation in these tumours resulted in an increase in tumour 
cell invasion into the brain (Fig. 5g, Extended Data Fig. 9p), and also 


course after irradiation of S24 shControl versus shGAP-43 tumours, 
corresponding quantifications (n = 3 mice per group; t-tests). f, MRI 
images of S24 shControl versus shGAP-43 tumours, 60 days after radiation 
(115 days after tumour implantation); right, quantifications of 5-6 animals 
per group (t-test). g, GAP-43 overexpression in BT088 oligodendroglial 
stem-like cell lines (OSCs) versus controls, 14 days after injection. 
h, Relative change of tumour volumes 21 days after radiotherapy in BT088 
vector-control versus GAP-43 overexpression tumours (n = 3 mice per 
group; t-test). c-e, g, h, In vivo MPLSM. Error bars show s.d. *P < 0.05, 
**P < 0.01, ***P < 0.001. 


in an increase in radioresistance (Fig. 5h); both were comparably low 
in control oligodendrogliomas. 


Conclusions 

The ability to interconnect via ultra-long and highly functional TMs is 
an important mechanism of progression and resistance in astrocyto- 
mas, and depends on molecular pathways that are active when 1p/19q 
is intact. The resulting multicellular network is able to communicate via 
Cx43 gap junction connections. Multiple functions have been reported 
previously for different connexins in glioma pathology. This 
includes Cx43, which is also highly expressed in non-malignant 
astrocytes, connecting them into functional and resistant cellular networks. 
Here we provide one possible mechanism for this resistance: 
maintenance of calcium homeostasis by network integration. Since increases 
in intracellular calcium levels are required for radiotherapy-induced 
cytotoxicity, and even small calcium increases are involved in intrinsic 
apoptotic cell death in glioma cells, one can assume that intercellular 
TMs can serve as a means for an individual tumour cell to distribute 
critical elevations of small molecules like calcium within the larger 
network, achieving nonlethal levels. 

The data presented here support the notion that tumours are complex 
organs, which has so far been attributed to the supportive contribution 
of non-malignant cell types, including neurons in brain tumours. 
Our study adds to this concept by demonstrating that single cancer cells 
within one tumour communicate and cooperate with each other in a 
complex but ordered manner that is by itself reminiscent of a functional 
organ. It has become clear that tumours can hijack programs that are 
part of normal tissue development. The key finding of this study is 
that TMs, which are generated by GAP-43 just as axons are in 
neurons, allow efficient tumour progression, network communication, and 
resistance to adverse events (Extended Data Fig. 10). Thus we anticipate 
that pharmacological targeting of TM formation and function will open 
new therapeutic avenues for treatment-resistant brain tumours. 
Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Animals and surgical procedures. Male NMRI nude mice aged 8–10 weeks were 
used for all studies with human primary brain tumour cells. As a syngeneic 
astrocytoma mouse model we used Nestin-Tv-a;Tlx-GFP mice in combination with 
RCAS-PDGFB/AKT vectors. All animal procedures were performed in 
accordance with the institutional laboratory animal research guidelines after approval 
by the Regierungspräsidium Karlsruhe, Germany (governmental authority). All 
efforts were made to minimize animal suffering and to reduce the number of 
animals used. Mice were clinically scored, and experiments were terminated if 
they showed marked neurological symptoms or weight loss of >20%. These limits 
were not exceeded in any of the experiments. No maximum tumour size was defined 
for the invasive brain tumour models. 

Cranial window implantation in mice was performed using a modification of a 
previously described procedure, including a custom-made titanium ring for painless 
head fixation during imaging. 

Two to three weeks after cranial window implantation, 30,000 tumour cells were 
stereotactically injected into the mouse brain at a depth of 500 µm. For survival 
experiments we injected 50,000 tumour cells. In a subgroup of mice, a short plastic tube 
was glued under the glass, with one end inside and one outside, which allowed 
topical application of different substances under the window without the need 
to re-open it. 

For intratumoral microinjection of sulforhodamine 101 (SR101, Molecular 
Probes, S-359), 50 nl of 100 µM SR101 (with or without 100 µM CBX, 
Sigma-Aldrich, C4790) was injected with a very thin glass pipette into tumour regions 
of similar cellular densities, >90 days after tumour injection. 

Radiation treatment. Established tumours were irradiated with 7 Gy on three 
consecutive days (total dose 21 Gy) in regions matching in tumour cell density 
using a 6 MV linear accelerator with a 6 mm collimator (adjusted to the window 
size) at a dose rate of 3 Gy min⁻¹ (Artiste, Siemens), or no radiation was applied 
(sham radiation), at day 60 (±10 days) after tumour implantation. For MRI 
studies, total brain irradiation with the same dose and a field size of 17 mm × 250 mm 
(allowing the irradiation of several mice) was used. The radiation schedule used 
is in the range of the commonly prescribed 60 Gy in 2 Gy fractions for malignant 
glioma patients, assuming an α/β of ~10 in the linear quadratic model and taking 
into account the radiation time of 3 days. 
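
As an illustration of how fractionation schedules are commonly compared in the linear quadratic model, the following sketch computes the biologically effective dose, BED = n·d·(1 + d/(α/β)). It is a minimal sketch only: the text does not state which formula was used, the function names are hypothetical, and the correction for the 3-day treatment time mentioned above is not modelled here.

```python
def biologically_effective_dose(n_fractions, dose_per_fraction_gy, alpha_beta_gy=10.0):
    """Standard linear-quadratic BED in Gy: n * d * (1 + d / (alpha/beta))."""
    total_dose_gy = n_fractions * dose_per_fraction_gy
    return total_dose_gy * (1.0 + dose_per_fraction_gy / alpha_beta_gy)


def equivalent_dose_in_2gy_fractions(bed_gy, alpha_beta_gy=10.0):
    """Convert a BED into the total dose that would give it in 2 Gy fractions."""
    return bed_gy / (1.0 + 2.0 / alpha_beta_gy)


# Experimental schedule from the text: 3 fractions of 7 Gy, alpha/beta ~ 10.
experimental_bed = biologically_effective_dose(3, 7.0)
# Clinical reference schedule from the text: 60 Gy in 2 Gy fractions.
clinical_bed = biologically_effective_dose(30, 2.0)

print(f"experimental BED: {experimental_bed:.1f} Gy")
print(f"clinical BED:     {clinical_bed:.1f} Gy")
print(f"experimental EQD2: {equivalent_dose_in_2gy_fractions(experimental_bed):.2f} Gy")
```

The α/β value is passed as a parameter so that the same comparison can be repeated under a different assumption.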

In vivo multiphoton laser scanning microscopy (MPLSM). MPLSM imaging was 
done with a Zeiss 7MP microscope (Zeiss) equipped with a Coherent Chameleon 
Ultra II laser (Coherent). The following wavelengths were used for excitation: 
750 nm (dsRed, FITC-dextran, tdTomato), 840 nm (Fluo-4AM), 850 nm (GFP, 
TRITC-dextran, Rhod-2AM), 860 nm (CFP, for Förster resonance energy 
transfer (FRET) imaging) and 950 nm (tdTomato, YFP). Appropriate filter sets 
(band pass 500–550 nm/band pass 575–610 nm and band pass 460–500 nm/band 
pass 525–560 nm (for FRET)) were used. Standard settings for imaging were gains 
between 650 and 750 nm (depending on the depth, the fluorescence intensity of 
the fluorophore and the window quality), and a z-interval of 3 µm. Laser power 
was kept as low as possible. 

The body temperature of mice was kept constant using a rectal thermometer and 
a heating pad. Isoflurane concentration (in 100% O₂) was chosen as low as possible 
(0.5–1.5%) to avoid interference with the calcium communication between 
astrocytoma cells. Fluorescent dextrans (FITC (2M MW)- or TRITC (500,000 MW)-conjugated, 
10 mg ml⁻¹, Sigma) were injected intravenously to obtain angiograms. 

For in vivo ablation of single astrocytoma cells, only the volume of the GFP- 
labelled cell nucleus was exposed to continuous scanning with a high power 
laser beam until disintegration of the nucleus became visible. To investigate the 
reaction of TMs after the photodamage of a wider brain region, a larger volume 
(0.5–1 × 10° µm³) was scanned repetitively for approximately 8 min with high 
power, resulting in a total photon dose that was >50 times higher than during 
“diagnostic” imaging. 

In vivo calcium imaging with MPLSM. The following small-molecule calcium 
indicators were applied to the brain surface for 45 min: for GFP-transfected tumour 
cells, 2 mM Rhod-2AM (Life Technologies, R-1244); for RFP-transfected tumour cells, 2 mM 
Fluo-4AM (Life Technologies, F-14201). Pharmacological gap junction inhibition 
was achieved by superfusion with the inhibitor CBX (100 µM; control substance: 
extracellular saline; n = 3 mice per group). Other superfused substances were 
suramin (100 µM, ATP antagonist) and 2-aminoethoxydiphenyl borate (2-APB, 
100 µM, inhibitor of inositol triphosphate receptors). Two genetically encoded 
calcium indicators were lentivirally transduced into GBMSCs: the Lck-GCaMP3 
sensor in the rrl-CAG-IGC3 vector (CAG promoter to control expression of DsRed 
and the Ca²⁺ sensor that monitors near-membrane changes in [Ca²⁺]i). The 
ratiometric calcium sensor Twitch-3 was used to determine intracellular calcium 
concentrations by FRET as previously described. 

MRI studies. MRI images were obtained at day 72 after tumour implantation 
for non-irradiated animals, and at day 115 for irradiated mice (60 days after 
radiotherapy; time points were chosen when the first control animals developed 
neurological symptoms and/or lost 20% of their weight, and had to be euthanized). All 
scans were performed on a 9.4 T horizontal bore MR scanner (BioSpec 94/20 
USR, Bruker BioSpin GmbH) with a four-channel phased array surface coil. 
A T2-weighted rapid acquisition with refocused echoes (RARE) sequence was 
acquired to determine tumour volume. 

Cell lines and cell culture experiments. Tumour cell lines derived from resected 
glioblastomas were cultivated in DMEM-F12 under serum-free, non-adherent, 
‘stem-like’ conditions, including B27 supplement (12587-010, Gibco), insulin, 
heparin, epidermal growth factor and fibroblast growth factor (GBMSCs: 
P3, S24, T1, T269, T325, WJ). These 6 GBMSC lines were selected because they 
were capable of growing to tumours in mouse brains; all were non-codeleted for 
1p/19q, and IDH wild-type. Two oligodendroglioma cell lines harbouring the 
typical 1p/19q codeletion (BT088 and BT054) were kept under the same cell 
culture conditions. Of note, BT054 is IDH1 R132H mutated, while BT088 has lost the 
heterozygous IDH1 R132H mutation of the patient tumour it was derived from, but 
still maintains its G-CIMP phenotype (data not shown). Typical genetic changes 
of glioblastoma were confirmed for S24 using comparative genomic hybridization 
(CGH, see Extended Data Fig. 1i); the T1, T269, T325 and WJ lines had been 
characterized before, as well as the P3 line. Cells were regularly checked for 
mycoplasma infections and authenticity (species control). 

Tumour cells were transduced with lentiviral vectors for multicolour 
imaging. For cytosolic GFP expression, we used the pLKO.1-puro-CMV-TurboGFP_shnon-target 
vector (SHC016, Sigma Aldrich); for cytosolic RFP (tdTomato) 
expression, the LeGo-T2 vector (gift from A. Trumpp); and nuclear GFP 
expression (H2B-GFP) was achieved by transduction with pLKO.1-LV-GFP (Addgene 
25999, Elaine Fuchs). Transduction with pLenti6.2 hygro/V5-Lifeact-YFP made 
it possible to image the in vivo dynamics of actin filaments; FUmGW (Addgene 
22479, Connie Cepko) allowed in vivo illustration of cell membranes. Microtubules 
were marked using the LentiBrite GFP-Tubulin Lentiviral Biosensor (17-10206, 
Merck Millipore). Lentiviral particles were produced as described before. For 
in vivo tracking of Myosin II, a plasmid transfection with FuGENE HD (Promega) 
was performed with the Myosin-IIA-GFP vector (Addgene 38297, Matthew 
Krummel). 

Production of lentiviral knockdowns of Cx43 (pLKO1.1-puro-CMV-tGFP- 
vector, Sigma Aldrich, target sequence: GCCCAAACTGATGGTGTCAAT) 
and GAP-43 (pLKO1.1-puro-CMV-vector, Sigma Aldrich, target sequence: 
TGTAGATGAAACCAAACCTAA) by shRNA technology was carried out 
as described before. Control cells were infected with the appropriate 
non-target shRNA lentiviral particles (SHC016, Sigma Aldrich). For overexpression 
of GAP-43, the open reading frame of GAP-43 was cloned into the 
pCCL.PPT.SFFV.MCS.IRES.eGFP.WPRE vector backbone. Lentiviral particle production 
and transduction of target cells were done as described before. 

Tumour cells were incubated with the harvested virus and 8 mg ml⁻¹ polybrene 
(Merck Millipore) for 24 h. Quantification revealed an 80% protein knockdown for 
Cx43 and a 92.5% knockdown for GAP-43 (Western blot analyses). If necessary, tumour cells 
were selected for the fluorophores by FACS sorting (BD FACSAria II Cell Sorter) 
or antibiotics. 

For tracking of mitochondria, the BacMam 2.0 technology was used (CellLight 
Mitochondria-GFP, BacMam 2.0, C10600, Life Technologies). 

Immunohistochemistry (IHC) and immunocytochemistry (ICC). For IHCs 
and ICCs, standard protocols were used. For human brain analyses, thin (3 µm) 
formalin-fixed paraffin-embedded human tissue sections from resected primary 
gliomas were obtained from the Department of Neuropathology in Heidelberg 
in accordance with local ethical approval. Human sections were incubated with 
anti-BRAF-V600E (VE1, Ventana), anti-IDH1 R132H (H09, Dianova), anti-Cx43 
(C6219, Sigma), anti-GAP-43 (8945, Cell Signaling), anti-NGF (ab52918, Abcam), 
anti-NT4 (ab150437, Abcam), anti-TrkA (ab76291, Abcam) and anti-TrkB 
(ab134155, Abcam) antibodies. If not explicitly stated, all oligodendrogliomas 
had a 1p/19q codeletion, and all astrocytomas were non-codeleted for 1p/19q. To 
detect contralateral tumour cells in human brains, large sections were analysed as 
previously described. 

For mouse brain analyses, animals were transcardially perfused with PBS 
followed by 4.5% paraformaldehyde (PFA). For ICCs, cells were grown on glass 
slides for 4 days and fixed with PFA. The following antibodies were used for 
10 µm cryotome sections and ICCs: anti-nestin (ab6320, Abcam, specific staining 
of GBMSCs, no signal detectable in normal mouse brain), in combination with 
anti-β-catenin (ab16051, Abcam), anti-β-parvin (sc-50775, Santa Cruz), anti-beta 
tubulin III (ab18207, Abcam), anti-Cx26 (ab59020, Abcam), anti-Cx31 (ab156582, 
Abcam), anti-Cx37 (ab185820, Abcam), anti-Cx43, anti-GAP-43, anti-GFAP 
(Z0334, Dako), anti-Ki-67 (M7240, Dako), anti-myosin IIa (ab24762, Abcam), 
anti-myosin X (22430002, Novus Bioscience), anti-N-cadherin (ab18203, Abcam), 
and anti-PDI (ab3672, Abcam). 

Photo-oxidation of GFP, and serial section scanning electron microscopy 
(SEM). Ten S24 GBMSC brain tumour tissue blocks were prepared for 
photo-oxidation as described previously. Serial 70 nm thick sections of photo-oxidized 
epon-embedded tissue were produced using 3D SEM as described before. A 
volume of 747 µm³ was imaged. Specimens were imaged with a Zeiss 1530 scanning 
electron microscope. Images were aligned manually. 

Western blot. Western blots were performed according to standard protocols. 
Total protein lysates (20–50 µg) were electrophoretically separated by 10% 
SDS-PAGE. After blotting and blocking, the primary antibodies (see above) were 
incubated overnight at 4 °C. As loading control, anti-GAPDH antibody (C4780, 
Linaris) or anti-alpha-tubulin antibody (T9026, Sigma) was used. 

Electroporation/microinjection. Horizontal acute brain slices were obtained from 
2 NMRI nude mice with 131-day-old S24 tumours as described. Patch electrodes with 
resistances of 5–10 MΩ were filled with Lucifer yellow (5 mg ml⁻¹, L0259, Sigma) 
and approached to identified tumour cells under visual control using a 63×, NA 1.0 
dipping lens (Zeiss). The dye was transferred into tumour cells with an Axoporator 
800A (Axon Instruments) by 1-ms square voltage pulses at 50 Hz. Pulse amplitude 
was adjusted between −5 V and −20 V and train duration was adjusted up to 3 s 
to achieve sufficient labelling of the target cell. 

Image processing. MPLSM images were acquired by the ZEISS ZEN Software 
(Zeiss, Germany). After primary image calculation (for example, subtraction of 
different channels to remove unspecific background), images were transferred to 
Imaris (Bitplane, Switzerland) to allow 3D visualization, rendering and analysis 
of the data. For illustration of different aspects, single planes, maximum 
intensity projections (MIPs) or 3D images were used. For exemplary illustration of 
tumour cell interconnectivity and TM branches, z-stacks were rendered manually 
(tumour cell bodies with surface function; TMs with filament tracker function). 
When a TM started at one cell and ended at another, these cells were defined 
as connected. Serial electron micrographs were reconstructed using OpenCAR 
software. 3D analysis of electron microscopy images was done using the Amira 
5.4.6 software (Visage Imaging, Richmond, Australia). Some of the data (for 
example, calcium imaging) were transferred to the ImageJ software (Rasband, 
W.S., ImageJ, NIH). Videos were extracted from ZEN or Imaris and edited in 
Adobe Premiere Pro CS6. 

Quantification of histology and MPLSM imaging data. In patient tumour tissue 
(only from primary tumour resections), maximum TM length was measured in 
standard 3 µm thin IDH1 R132H IHC sections. Here, TMs were divided into 3 groups: 
<50 µm (not qualifying as definite TM, because other cellular structures might still 
be confused with filamentous structures of this length); shorter TMs of 50–100 µm; 
and longer TMs of >100 µm length. Quantitative analysis of human IHCs was done 
by a Histoscore (range 0–300) as described before. 
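
The three length groups can be written as a small classification rule. The sketch below is illustrative only: the function name and the group labels are hypothetical, and assigning a length of exactly 100 µm to the shorter group follows the ranges as stated ("50–100" versus ">100").

```python
def classify_tm_length(max_length_um):
    """Assign a measured maximum TM length (in micrometres) to one of the three groups."""
    if max_length_um < 0:
        raise ValueError("length must be non-negative")
    if max_length_um < 50:
        # Too short to be distinguished reliably from other filamentous structures.
        return "not a definite TM"
    if max_length_um <= 100:
        return "shorter TM"
    return "longer TM"


for length in (20, 50, 100, 180):
    print(length, "->", classify_tm_length(length))
```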

For in vivo imaging data, TM numbers, branches per TM and connections per 
cell were counted manually, and TM lengths were measured manually in the slice 
mode in Imaris. Cells without a TM connection were defined as “non-connected” 
and cells with at least one TM-connection as “connected”. TMs were also classified 
as connecting when the connected cell was outside the region of interest. To analyse 
the number of TMs before and after irradiation, the TMs of individual, identical 
cells were counted at both time points. The mean speed of tumour cell invasion 
in S24 shControl versus shGAP-43 tumours was determined by analysis of three 
consecutive imaging time points within a 24 h time interval in vivo. Distances of 
tumour cells to the main tumour mass (defined as a radius of 0.5 mm around the 
middle of the main tumour) were analysed and grouped, or displayed as 
individual distances to the main tumour core in tumours that were much less invasive 
(oligodendrogliomas). Nuclei and mitochondria (time-lapse imaging data) were 
marked using the spot function of Imaris. They were connected to tracks and the 
mean track speed was calculated. For quantification and analysis of fluorescence 
intensity after SR101 application, all GFP-expressing tumour cells in a volume 
were marked using the spot function of Imaris and then mean intensities of the 
SR101-channel of these spots were calculated and compared with each other. For 
quantification of tumour volumes two regions per animal were marked using the 
surface function of Imaris. 
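
For readers who want to reproduce a mean track speed outside Imaris, the following minimal sketch computes it from a list of positions and time points. It assumes that mean track speed is the total path length divided by the elapsed time; whether Imaris uses exactly this definition is an assumption, and the function name and example coordinates are hypothetical.

```python
import math


def mean_track_speed(positions_um, times_h):
    """Total path length (µm) divided by elapsed time (h) for one tracked object."""
    if len(positions_um) != len(times_h) or len(positions_um) < 2:
        raise ValueError("need at least two positions with matching time points")
    elapsed_h = times_h[-1] - times_h[0]
    if elapsed_h <= 0:
        raise ValueError("time points must span a positive interval")
    # Sum the straight-line distances between consecutive positions.
    path_um = sum(math.dist(a, b) for a, b in zip(positions_um, positions_um[1:]))
    return path_um / elapsed_h


# Three consecutive imaging time points within 24 h (illustrative coordinates).
track = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(mean_track_speed(track, [0.0, 12.0, 24.0]), "µm per hour")
```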

Quantification and analysis of calcium imaging data. Tumour cells and 
non-malignant brain astrocytes were identified by GFP/RFP expression and uptake of 
the chemical calcium indicator, and marked manually using the region of 
interest manager of ImageJ. Mean grey values were measured over time. These data 
were processed with GNU Octave 3.8.1 (John W. Eaton, GPL): images 
were normalized to the background fluorescence using a sliding interval of ±10 
images. Local maxima of calcium signals were detected by the findpeaks function 
(signal package, Octave-Forge). Thus the number of calcium peaks of each cell (N) 
could be determined and the frequency (f) was calculated. The frequency was 
standardized for the cell number of each region. Synchronous cells, the number of 
synchronous communications, and the time point of the synchronous firing were 
determined. Analysis was done in a window of 2 frames around each peak. This 
allowed assessment of the synchronicity S (S ∈ ℝ⁺ ∪ {0}), which we defined as 
the total number of synchronous cells (Nsyn) divided by the 
number of calcium peaks for the given cell (NCa). If a cell was not active, a 
synchronicity of zero was assigned. 


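$$
\text{Synchronicity } S =
\begin{cases}
\dfrac{N_{\mathrm{syn}}}{N_{\mathrm{Ca}}} & \text{for } N_{\mathrm{Ca}} > 0 \\
0 & \text{for } N_{\mathrm{Ca}} = 0
\end{cases}
$$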


Hence, synchronicity states the average number of interactions at the same time 
point. For example, in a system with a synchronicity of 1, a firing cell interacts 
with a second one; for a synchronicity of 10, one cell is communicating with 10 
other cells. 
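
The synchronicity definition can be sketched in a few lines of Python. This is not the original Octave analysis: the function name and data layout are hypothetical, peak frames are assumed to be already detected, and "synchronous" is assumed to mean that another cell has a peak within ±2 frames of the given peak.

```python
def synchronicity(peak_frames_by_cell, cell, window=2):
    """Synchronous partner cells summed over a cell's peaks, divided by its peak count."""
    own_peaks = peak_frames_by_cell[cell]
    if not own_peaks:
        # Inactive cells are assigned a synchronicity of zero.
        return 0.0
    n_syn = 0
    for frame in own_peaks:
        for other, other_peaks in peak_frames_by_cell.items():
            if other == cell:
                continue
            # Count each partner cell at most once per peak of the given cell.
            if any(abs(frame - f) <= window for f in other_peaks):
                n_syn += 1
    return n_syn / len(own_peaks)


# Illustrative peak frames for four cells (not measured data).
peaks = {"a": [10, 50], "b": [11], "c": [9, 51], "d": []}
for name in peaks:
    print(name, synchronicity(peaks, name))
```

With this counting rule, a value of 1 means that each peak of the cell coincides on average with a peak in one other cell, matching the interpretation given in the text.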

For the comparison between different blockers in vivo, the synchronicity was 
normalized to the baseline level. Finally, the results were summarized in a heat 
map. The number of calcium peaks of these cells was coded by a colour map. 
Synchronous cells were connected by lines whose colour indicated the time 
point of the synchronous firing. 

For measurement of relative changes in fluorescence intensity, tumour cells 
were again marked manually and relative changes were calculated (ΔF/F0). F0 was 
defined as the average intensity of the 20% lowest grey values in a region of interest. 
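
A minimal sketch of this normalization is given below. It assumes that the 20% lowest grey values are taken from the time series of mean grey values of one region of interest; the function name and the example trace are hypothetical.

```python
def delta_f_over_f0(trace, lowest_fraction=0.2):
    """Relative fluorescence change, with F0 as the mean of the lowest grey values."""
    if not trace:
        raise ValueError("trace must not be empty")
    n_low = max(1, int(len(trace) * lowest_fraction))
    f0 = sum(sorted(trace)[:n_low]) / n_low
    if f0 == 0:
        raise ValueError("baseline F0 is zero; relative change is undefined")
    return [(value - f0) / f0 for value in trace]


# Illustrative mean grey values over ten frames (not measured data).
grey_values = [10, 10, 12, 20, 30, 10, 11, 15, 10, 10]
print(delta_f_over_f0(grey_values))
```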

Quantification and analysis of MRI images. The slice with the largest tumour 
area per mouse was chosen, and both the tumour (hyperintense on T2-weighted 
images) and the whole brain were segmented manually. The ratio of these two 
areas was determined and compared between the different groups (n = 6 mice 
per group, t-tests). 

Functional characterization of differential mRNA expression of human 
gliomas. RNA sequencing raw data (mapped to genes) and curated IDH-1/2 
mutation data were downloaded from The Cancer Genome Atlas (TCGA) data portal 
on 30 January 2015, and last updated on 6 May 2015. Additionally, copy-number 
calls (using GISTIC 2.0) from the cBioPortal and 450k- as well as 
CNV-NME-clustering results from the Broad GDAC Firehose (http://gdac.broadinstitute.org/) 
were acquired. Only IDH mutant samples which clearly clustered to either 
the 1p/19q codeleted or 1p/19q non-codeleted group (and had the respective 
copy-number profile; 194 samples: 124 non-codeleted, 70 codeleted) were kept 
for further analysis. The rationale for restricting the primary analysis to IDH mutated 
gliomas was that the IDH mutation itself has a profound impact on epigenetic and 
gene expression patterns in gliomas. First, normalization and differential gene 
expression analysis of RNA sequencing counts was performed using the edgeR 
package, which assumes a negative binomial distribution of count data, 
filtering lowly expressed transcripts. Differentially activated signalling pathways and 
downstream effects between codeleted and non-codeleted IDH mutated tumours 
were analysed with the proprietary Ingenuity Pathway Analysis (Qiagen) using a 
fold change filter of |1.5| and FDR-q < 0.05 for the input list. Briefly, the software 
calculates both an overlap P value (based on Fisher’s exact test) and an 
activation z score, which is based on the expression state of activating and inhibiting 
genes, for manually curated pathways and downstream biological functions. For 
this exploratory, hypothesis-generating study, results with both P < 0.1 and a 
z score > |1.5| were kept. 
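
The two thresholding steps can be illustrated with a short sketch. It is not the Ingenuity Pathway Analysis implementation: the function names are hypothetical, the gene and pathway entries are made up for illustration, and fold changes are assumed to be signed (for example −2 for a twofold decrease), so that the filter of |1.5| applies to the absolute value.

```python
def passes_input_filter(fold_change, fdr_q, fc_cutoff=1.5, q_cutoff=0.05):
    """Gene-level filter for the input list: |fold change| >= 1.5 and FDR-q < 0.05."""
    return abs(fold_change) >= fc_cutoff and fdr_q < q_cutoff


def keep_result(overlap_p, z_score, p_cutoff=0.1, z_cutoff=1.5):
    """Pathway-level filter: overlap P < 0.1 and absolute activation z score > 1.5."""
    return overlap_p < p_cutoff and abs(z_score) > z_cutoff


# Illustrative entries only: (name, signed fold change, FDR-q).
genes = [("GENE_A", 2.3, 0.001), ("GENE_B", -1.2, 0.01), ("GENE_C", -3.0, 0.2)]
input_list = [name for name, fc, q in genes if passes_input_filter(fc, q)]
print(input_list)

# Illustrative entries only: (name, overlap P, activation z score).
pathways = [("PATHWAY_X", 0.03, 2.1), ("PATHWAY_Y", 0.03, 0.4), ("PATHWAY_Z", 0.5, -2.0)]
kept = [name for name, p, z in pathways if keep_result(p, z)]
print(kept)
```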

To confirm the relevance of the results for IDH wild-type astrocytomas, we 
also analysed functional transcriptomic differences between IDH wild-type, 
non-codeleted gliomas (n = 56) and IDH mutated, 1p/19q codeleted gliomas (n = 70) 
from the TCGA RNASeq data using the analysis strategy from above. As this was 
a secondary, exploratory analysis, we did not perform multiple-testing adjustments 
for the results of our primary analysis. 

Statistics. The results of image analyses were transferred to the SigmaPlot 
Software (Systat Software, Inc.) to test the statistical significance with the 
appropriate tests (data were tested for normality using the Shapiro-Wilk test and for 
equal variance). Statistical significance was assessed by the two-sided Student's 
t-test for normally distributed data. Otherwise a Mann-Whitney test was used 
for non-normal distributions. For more than two groups a one-way ANOVA or 
an ANOVA on ranks was performed. For contingency tables, a Fisher's exact 
test was used. For Kaplan-Meier survival analysis, a log rank test was performed. 
Results were considered statistically significant if the P value was below 0.05. 
Quantifications were done blinded by two independent investigators. Animal 
group sizes were as low as possible and empirically chosen, and longitudinal 
measurements allowed a reduction of animal numbers while maintaining adequate 
power. No statistical methods were used to predetermine sample size. If 
treatments were applied, animals were randomized to these procedures. Quantitative 
in vivo data are normally depicted as mean ± standard deviation. The calculated 
calcium imaging frequency and synchronicity values were corrected for outliers 
using the Nalimov test. 

Extended Data Figure 1 | Different primary glioblastoma cell lines 
(GBMSCs) growing to astrocytic tumours in the mouse brain. 
a–f, In vivo microscopy (3D) of 6 different GBMSC lines (all 
non-codeleted for 1p/19q, and IDH wild-type) reveals abundant formation of 
ultra-long membrane protrusions in the mouse brain: T1 (a), T269 (b), 
T325 (c), S24 (d), WJ (e), and P3 (f) (z-dimensions from 200–500 µm 
depth). Insets show the boxed areas in the corresponding images in higher 
magnification, covering a proportion of the z-dimension. Per cell line, 
two time points are shown, adapted to their growth speed in vivo (T269, 
P3 fast; T1, S24 intermediate; T325, WJ slow). g, 3D image of an S24 
astrocytoma (injection of a 1:1 mixture of either GFP- or RFP-positive 
cells), revealing multiple ultra-long and very thin membrane protrusions 
(arrows) in the live mouse brain. Note that membrane tubes partly 
run in parallel. h, CGH profile of the S24 GBMSC line demonstrating 
chromosomal alterations typical for GBM (chromosome 7 gain, 10 loss). 
i, Chromosome 7 FISH analysis of one S24 GBMSC in the main tumour 
area demonstrates polyploidy: 90% of n = 100 analysed cells in the main 
tumour area were clearly polyploid for chromosome 7, indicating that 
implanted S24 GBMSCs give rise to tumours genetically identifiable as 
glioblastomas. j, Whole mouse brain coronal sections at day 171 after 
S24 injection showing two main features of glioblastoma growth: diffuse 
brain invasion in a typical dissemination pattern (left image), and a solid, 
angiogenic core identified by haemorrhagic changes of the main tumour 
area (right bright field image). k, Increasing angiogenesis in this tumour 
is further demonstrated by dynamic in vivo MPLSM. l, Actin-rich S24 
GBMSC tip, invading into the brain (single plane images; schematic 
drawing below). In vivo MPLSM: a–g, k, l. 

Extended Data Figure 2 | Characterization of membrane microtubes in 
astrocytoma mouse models. a, Number and length of protrusions during 
tumour progression (S24 tumours; n = 77–120 cells in n = 3 mice). 
b, MPLSM images of S24 GBMSCs genetically expressing green 
fluorescent protein (GFP, green) linked to different cellular/molecular 
components. c, Confocal immunohistochemistry (maximum intensity 
projections) of human nestin (green, allows specific detection of S24 
GBMSC-related structures in the mouse brain), and different other 
cellular and molecular factors (red, co-stainings). The degree of expression 
of the factor in tumour cell-derived membrane tubes is indicated in 
the right lane. −, no signal in membrane tubes; (+), positive signal in 
some membrane tubes; +, positive signal in all membrane tubes. In vivo 
MPLSM, a, b. 

Extended Data Figure 3 | Membrane microtube dynamics and 
morphology. a, 3D reconstruction of membrane microtubes in a T325 
astrocytoma over 3 days (in vivo MPLSM). Arrowheads, stable main tube; 
arrows, dynamic side tubes. b, Example of a very stable T325 GBMSC 
membrane microtube (arrowheads), followed over 126 days in vivo; MIP, 
z-dimension 48 µm. c, Scanning electron microscopy (SEM) image of two 
photoconverted membrane microtubes (arrows) and a nucleus of a 
non-photoconverted brain cell (N). d, 3D reconstruction of serial SEM images 
(22.29 µm (xy) × 4.62 µm (z) = 102.9 µm²) illustrating the membrane 
contours. e, Maximum speed of mitochondria in S24 membrane tubes 
versus tumour cell soma (n = 10 per group, t-test, red lines show means). 
f, 3D reconstruction of serial SEM sections of the membrane microtube 
(red) and the two axons (green), which are shown in Fig. 1f. g, 3D image 
of the genetic Tlx mouse glioma model, with abundant membrane 
microtubes connecting single stem-like astrocytoma cells (z-dimension 
83 µm). In vivo MPLSM: a, b, e, g. *P < 0.05. 

Extended Data Figure 4 | Origin of TM-connections between 
astrocytoma cells, and long-time tracking of TM-extending cells. 

a, Graphs illustrating two theoretically possible ways of intercellular 
connections by membrane tubes in a model of two tumour cell 
populations marked with 2 different fluorescent proteins. In hypothesis 1, 
tumour cells remain connected after cell division with their ancestors. In 
this case, only connections between cells of the same colour are expected 
(GFP-GFP (green) or RFP-RFP (red)). In hypothesis 2, tumour cells 

only connect to unrelated glioma cells. Here, 50% of connections would 
be between cells of different colour (GFP-RFP or RFP-GFP (grey)), 

and 25% of the same colour (GFP-GFP (green) and RFP-RFP (red)), 
respectively. b, Quantification of the real data set, where a 1:1 mixture of 
either GFP or RFP expressing S24 GBMSCs (S24GFP/S24RFP) was 
co-injected into the mouse brain, revealing that both potential 
mechanisms are in place (n = 164 connections in n= 3 mice). c, 3D image 
(70 days after injection) of a co-implantation of GFP- and RFP-expressing 
S24 GBMSCs. Quantification revealed that both large fluorophores (which 
cannot pass gap junctions) never colocalized in cell somata or in TMs 
(n > 2,500 astrocytoma cells analysed). d, e, Examples of 3D images of 
membrane tube connections between individual, non-related astrocytoma 
cells that differently express GFP or RFP (arrows in d and e). f, Example 
of a 3D image of same-colour connections between two RFP-positive 
cells (arrows). g, Scanning electron microscopy image of an S24 spheroid. 
Left, yellow colour marks cell bodies, arrowheads point to membrane 
microtubes; right, high magnification of tubes with direct membrane 
contact (arrow). h, 3D images of a perivascular T325 astrocytoma cell 
(arrows), which first utilizes a TM to explore the perivascular niche 
(D45–D73) until it moves to the explored region, and remains in a 

strict perivascular position until day 255. A second cell (arrowhead) is 
quiescent until D129 and is embedded into a vascular loop formation, 
which persists after disappearance of the main cell soma. i, MIP of a 
TM-containing S24 GFP astrocytoma cell which enters a perivascular 
position over time (arrow), and another one which remains in its 
non-vascular (parenchymal) position over 105 days (arrowhead). In vivo 
MPLSM, c-f, h, i; 50-650 um deep in the brain. 

Extended Data Figure 5 | TMs in 1p/19q codeleted versus non-codeleted 
gliomas. a, 3D image (in vivo MPLSM) of a BT088 oligodendroglioma 
xenograft tumour growing in the mouse brain; inset shows the boxed 
area in a higher magnification. Cells are rounded, TMs are scarce. 
b, Quantification of TM lengths of BT088 oligodendroglioma cells (left), 
and S24 astrocytoma cells (right), at day 60 after tumour implantation. 
n = 3 animals per entity. c, IDH1 R132H immunohistochemistry of the 
contralateral brain hemisphere (macroscopically tumour-free) of a patient 
deceased from a WHO III astrocytoma. d, Staining of resected primary 
glioblastomas (n = 3, non-codeleted, IDH wild-type) with a 
mutation-specific antibody against their BRAF V600E mutation reveals the existence 
of long tumour-cell-derived membrane microtubes in these tumours. 
Representative image. e, Exemplary IDH1 R132H immunohistochemistry 
of gliomas morphologically classified as oligoastrocytoma, with (left) 
or without (right) 1p/19q codeletion. f, Maximum microtube length 
of oligoastrocytomas with 1p/19q codeletion (OA CODEL; n = 31 
patients) and without (OA NON-CODEL; n = 9 patients). g, Maximum 
microtube length of tumours morphologically classified as astrocytomas 
but with 1p/19q codeletion (“A” CODEL; n = 6 patients), or classified as 
oligodendrogliomas but without 1p/19q codeletion (“O” NON-CODEL; 
n = 9 patients). In vivo MPLSM: a, b. 

Extended Data Figure 6 | Intercellular communication via gap 
junctions in TM-connected astrocytoma cells, and its impact on tumour 
progression. a, Example of a calcium wave involving TMs of GBMSCs 
in a tumour region; measurement by the genetically encoded sensor 
Twitch-3 that allows ratiometric calcium measurements via FRET. Shown 
is an overlay of cpVenusCD and CFP channels. Yellow colour reflects low, 
red colour high calcium concentrations. Right: ratios of single sections of 
one TM illustrating the propagation of a calcium wave along the TM. 

b, MIP (10 slices) of the region shown in Fig. 2b (red cells, astrocytes; 
green cells, tumour cells without Rhod-2AM signal; yellow cells, tumour 
cells with Rhod-2AM signal). c, Exemplary heat map of intercellular 
calcium wave (ICW) communications between T325 astrocytoma cells 
transfected with the genetically-encoded calcium sensor GCaMP3. 

d, Heat map of the region shown in Supplementary Video 4 (small 
molecule calcium indicator Fluo-4AM). e, Frequency of calcium peaks 
recorded during brain superfusion with extracellular saline (ES-control) 
versus 100 µM carbenoxolone (CBX) in GBMSCs (blue box) and normal
brain astrocytes (red box); n = 3 mice per group; t-tests. f, Analysis of
baseline-normalized synchronicity (see Methods for details) of calcium 
signals between S24 GBMSC glioma cells versus those between normal 
brain astrocytes. Different pharmacological blockers of main propagation 
mechanisms of ICWs were tested: inositol triphosphate was blocked by 
2-APB, cellular ATP receptors by the nonselective purinergic 2 receptor 
antagonist suramin, and gap junctions were blocked by CBX (glioma cells, 
t-tests; astrocytes, Mann-Whitney tests). ES, extracellular saline used 




as control. g, 3D images (z-dimension 180 µm) of SR101 microinjected
tumours, without (control, upper image) and with co-injected CBX 
(lower image; area of injection: circles) 120 min. after injection. Red 
cells, normal brain astrocytes. Graph, corresponding quantification of 
SR101-fluorescence (n = 4,962-5,676 cells in n =3 mice per group; 
Mann-Whitney test). h, 3D images of a non-TM-connected S24 tumour
cell (S24tdTomato), loaded with the gap-junction permeable dye Lucifer
yellow via electroporation. i, 3D images of TM-connected S24 tumour
cells (S24tdTomato) after dye transfer into one of the TM-connected cells. 
j, Quantification of Lucifer yellow fluorescence intensity in the 
neighbouring cells next to the electroporated cell (n =4 sections from 
n=2 mice; n= 64 TM-connected versus n = 42 non-TM connected cells 
quantified; t-test). k, Western blot analysis of Cx43 protein expression 

in 4 GBMSC and 2 oligodendroglioma stem-like (OSC) cell lines. 

l, Immunohistochemistry demonstrating the localization of different
connexins in S24 GBMSCs; no clear TM-related expression, and/or
localization at TM crossings could be observed. m, Proportion of 
TM-devoid (0 TMs) versus TM-rich (>4 TMs) cells in shControl versus 
shCx43 tumours 20 and 40 days after tumour implantation (n = 3 mice 
per group, ANOVA, Tukey’s post hoc test). n, Kaplan-Meier survival plot 
of animals implanted with shCx43 vs. shControl S24 GBMSCs (log rank
test). a-g, m, Acquired by in vivo MPLSM. h, i, l, Confocal microscopy
images. For gel source data, see Supplementary Fig. 1. Error bars show s.d.
*P<0.05, ***P < 0.001. 
Extended Data Figure 7 | Effects of radiotherapy on cellular
morphology, long-term survival, tumour cell communication, and 
calcium homeostasis in astrocytomas. a, 5 days after initiation of 
radiotherapy (3 × 7 Gy), nuclear fragmentation characteristic for apoptosis
(arrow) can be detected in a proportion of cells. Green, nuclear staining 
by H2B-GFP transduction; red, S24 cell cytoplasm. b, Representative

42 day time course of a distinct tumour microregion, followed after start 
of radiotherapy (day 0). TM-connected cells (two examples are marked 
with black asterisks) show long-term survival; note that surviving cells 
show an increase in the number of their TMs. n =3 mice per group. 

c, Exemplary heat maps of calcium transients (Rhod-2AM) of a sham 
treated (left) and radiated GBMSC tumour region (right). d, Relative 
changes of all cells (left) and subgroups of TM-connected versus 
non-connected GBMSCs of shControl versus shCx43 tumours after 
sham/radiotherapy (n= 3 mice per group, t-tests). e-h, Ratiometric 
measurements of basal calcium levels in vivo. e, Mean ratios of 
fluorescence intensities of the FRET partners cpVenusCD and CFP,
before, and after two days of radiation (2 × 7 Gy) in TM-connected cells
(n=3 mice per group; Mann-Whitney test). f, Fluorescence intensities 
(normalized by the mean intensities of the corresponding data sets) in 
TM-connected cells for the two FRET partners illustrated by a scatter 
plot (black dots represent analysed cells at the day before radiotherapy,
red dots 2 days after initiation of radiotherapy); linear regression 
revealed similar correlation strengths at the two time points (n = 3 mice), 
reflecting very homogenous calcium levels in the astrocytoma cells 
before and after radiotherapy. g, Mean ratios of fluorescence intensities 
of the FRET partners before and after two days of radiation (2 x 7 Gy) 
in non-connected cells, n = 3 mice; Mann-Whitney test. h, Normalized 
fluorescence intensities in non-TM-connected cells for the two FRET 
partners. Here, linear regression revealed highly homogeneous basal 
calcium levels only before radiotherapy, while after radiotherapy the 
linear correlation was lost, illustrating heterogeneous calcium levels in 
the analysed cells. (n =3 mice per group). All data acquired with in vivo 
MPLSM. GBMSCs, S24 cell line. 
[Extended Data Figure 8c, table residue: canonical pathways from Ingenuity Pathway Analysis, each listed with two significance/ratio columns, a z-score and the associated genes, ordered from the highest positive z-score (Fcγ receptor-mediated phagocytosis in macrophages and monocytes, 3.317) to the most negative (PPARα/RXRα activation, −3). Individual entries are not reliably recoverable from the extracted text.]
Extended Data Figure 8 | In silico analysis of 1p/19q codeleted versus
non-codeleted IDH mutated human gliomas. Biological function
analysis of 1p/19q non-codeleted (n = 124) versus 1p/19q codeleted
(n = 70) human gliomas of the TCGA database was performed using
Ingenuity Pathway Analysis. All tumours analysed were IDH mutated
(G-CIMP+). a, Bar plot of the top differentially regulated downstream
biological functions. b, Heat map of downstream biological functions.
The map is colour coded: more intense orange means more activation
in 1p/19q non-codeleted tumours (compared to codeleted tumours),
blue the other way round. Note the activation of “cellular movement”
and “cell-to-cell signaling” in non-codeleted tumours. c, Results of the
analysis of canonical pathways in 1p/19q non-codeleted versus codeleted
gliomas. Higher positive z-score: upregulated in 1p/19q non-codeleted
versus codeleted gliomas; higher negative z-score: upregulated in 1p/19q
codeleted gliomas versus non-codeleted gliomas.
Extended Data Figure 9 | Proficiency for GAP-43 expression drives
malignant features associated with TMs. a, TrkA, TrkB, NGF and 

NT-4 protein expression detected by immunohistochemistry in 1p/19q
codeleted versus non-codeleted human gliomas (n = 8 each, t-tests, all 
IDH mutated). b, Western blot analysis of GAP-43 protein expression of 
different glioma cell lines. OSC, oligodendroglioma stem-like cell lines. 
c, GAP-43 western blot of 4 GBMSC lines cultured under non-adherent, 
stem-like (SC+) versus differentiating, serum-containing, adherent (SC−)
conditions. d, In vivo 3D images of S24 shControl versus shGAP-43
GBMSCs (left) and quantification of TM side branches 20 days after 
implantation (n= 60 cells in n =5/6 mice, t-test). e, Spheroid invasion 
assay from S24 shControl versus shGAP-43 cells in a gel matrix, and 

the corresponding quantification (t-test). f, In vivo tumour cell invasion 
distance within 24 h of S24 shControl versus shGAP-43 GBMSC tumours
(n=3 mice, Mann-Whitney test). g, In vivo proliferation dynamics 

in the main tumour area (volume of 0.037 mm³; n = 4 mice, Mann-Whitney
tests). h, Fraction of TM-connected cells at day 20 in these
tumours (n = 164 cells in n=6 mice, t-test). i, Western blot analysis of 
Cx26 (expressed in normal astrocytes), Cx31 and Cx37 (both located on 
chromosome 1p), and Cx43 protein expression in shGAP-43 GBMSCs


versus shControls. Of note, the GAP-43 knockdown leads to a Cx43 
protein reduction of 89%, while expression of the other connexins was not 
reduced. j, T2 MRI images of S24 shControl versus shGAP-43 tumours,
72 days after tumour implantation. Quantifications of n = 6 animals per
group (t-test). k, Kaplan-Meier survival plot of S24 shControl versus
shGAP-43 tumour-bearing mice (log rank test). l, Exemplary brain
sections with nestin immunohistochemistry of S24 shControl versus 
shGAP-43 tumours 60 days after radiotherapy. Note that in shGAP-43 
tumours, only small remnants of tumour cells can be detected by the 
tumour cell-specific staining. Regions with highest tumour cell 

densities (boxes) were quantified for proliferation index (Ki-67-positive 
cells/all cells; n = 3 animals; t-test). m, Overexpression of GAP-43 in 
BT088 oligodendroglioma cells results in protein levels similar to that 

in GBMSCs. n-p, GAP-43 overexpression in BT088 oligodendroglioma 
cells leads to an increase in TM numbers (n, n= 80 cells in n =3 mice per 
group), more TM branches (o, n = 40 cells in n = 3 mice per group), and

a higher invasion capacity (p, n= 75 cells in n=3 mice per group; t-tests) 
14 days after tumour injection. Error bars show s.d. Red lines show means.
In vivo MPLSM, d, f-h, n-p. For gel source data, see Supplementary Fig. 1. 
*P<0.05, **P<0.01, ***P<0.001. 
Extended Data Figure 10 | Schematic illustration of the role of TMs in brain tumour progression. Anatomical and molecular mechanisms of
TM-driven tumour dissemination and network function in astrocytomas. MV, microvesicles; mito, mitochondrion; ER, endoplasmic reticulum; 
MT, microtubules. 
Overflow metabolism in Escherichia coli
results from efficient proteome allocation 


Markus Basan, Sheng Hui, Hiroyuki Okano, Zhongge Zhang, Yang Shen, James R. Williamson & Terence Hwa


Overflow metabolism refers to the seemingly wasteful strategy in which cells use fermentation instead of the more efficient 
respiration to generate energy, despite the availability of oxygen. Known as the Warburg effect in the context of cancer 
growth, this phenomenon occurs ubiquitously for fast-growing cells, including bacteria, fungi and mammalian cells, but 
its origin has remained unclear despite decades of research. Here we study metabolic overflow in Escherichia coli, and 
show that it is a global physiological response used to cope with changing proteomic demands of energy biogenesis and 
biomass synthesis under different growth conditions. A simple model of proteomic resource allocation can quantitatively 
account for all of the observed behaviours, and accurately predict responses to new perturbations. The key hypothesis 
of the model, that the proteome cost of energy biogenesis by respiration exceeds that by fermentation, is quantitatively 
confirmed by direct measurement of protein abundances via quantitative mass spectrometry. 


Under anaerobic conditions, organisms ranging from bacteria to mam- 
malian cells excrete large quantities of fermentation products such as 
acetate or lactate. Notably, the excretion of these fermentation products 
occurs widely even in the presence of oxygen in fast-growing bacteria 
and fungi, as well as mammalian cells including stem cells, immune
cells and cancerous cells. This seemingly wasteful phenomenon, in
which fermentation is used instead of the higher ATP-yielding respiration
process for energy generation, is generally referred to as overflow
metabolism (or the Warburg effect in the case of cancer). Various
rationalizations of overflow metabolism as well as regulatory schemes
have been proposed over the years. However, quantitative tests
of the proposed hypotheses as well as systematic characterization of 
overflow metabolism are generally lacking. 

In this study, we provide a quantitative, physiological study of over- 
flow metabolism for the bacterium E. coli. We report an intriguing set 
of linear relations between the rates of acetate excretion and steady- 
state growth rates for E. coli in different nutrient environments and 
different degrees of induced stresses. These relations, together with the 
recently established concept of proteome partition, led us to a simple
theory of resource allocation, which can quantitatively account for all 
of the observed behaviours, as well as accurately predict responses to 
new perturbations. Key parameters of the theory regarding the pro- 
teome costs of energy biogenesis by respiration and by fermentation 


were determined by quantitative mass spectrometry following a 
coarse-graining approach. These results suggest that overflow metab- 
olism is a programmed global response used by cells to balance the 
conflicting proteomic demands of energy biogenesis and biomass syn- 
thesis for rapid growth. 


Threshold-linear response of acetate overflow 

Previous studies have established a strong positive correlation between 
the rate of acetate excretion and the dilution rate for various strains of 
E. coli grown in glucose-limited continuous culture (Extended
Data Fig. 1a-e). Here, we measured acetate excretion and growth rates of
a wild-type E. coli K-12 strain grown in minimal medium batch culture 
with a variety of glycolytic substrates as the sole carbon sources (black 
symbols in Fig. 1). Notably, the rate of acetate excretion per biomass,
$J_{ac}$, exhibits a simple threshold-linear dependence on growth rate $\lambda$,

$$
J_{ac} =
\begin{cases}
s_{ac} \cdot (\lambda - \lambda_{ac}) & \text{for } \lambda \geq \lambda_{ac} \\
0 & \text{for } \lambda < \lambda_{ac}
\end{cases}
\qquad (1)
$$

with a linear dependence above a characteristic growth rate
($\lambda_{ac} \approx 0.76\ \mathrm{h}^{-1}$, or 55 min per doubling), below which acetate
excretion disappears. We refer to this linear relation as the acetate line (red
line in Fig. 1).
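The threshold-linear form of equation (1) can be sketched in a few lines of Python. This is only an illustration: the threshold of 0.76 per hour is taken from the text, whereas the slope value `S_AC_ASSUMED` is a placeholder chosen for the example and is not a fitted value from the paper. The helper `doubling_time_minutes` simply converts a growth rate into a doubling time to check the quoted "55 min per doubling".

```python
import math

# Threshold growth rate quoted in the text (per hour).
LAMBDA_AC = 0.76
# ASSUMPTION: illustrative slope only; the fitted slope is not given in this section.
S_AC_ASSUMED = 10.0


def acetate_excretion_rate(growth_rate, s_ac=S_AC_ASSUMED, lambda_ac=LAMBDA_AC):
    """Equation (1): linear in growth rate above lambda_ac, zero below it."""
    return s_ac * max(growth_rate - lambda_ac, 0.0)


def doubling_time_minutes(growth_rate):
    """Doubling time in minutes for an exponential growth rate given per hour."""
    return math.log(2) / growth_rate * 60.0


if __name__ == "__main__":
    for rate in (0.4, 0.76, 1.0):
        print(rate, acetate_excretion_rate(rate))
    print(doubling_time_minutes(LAMBDA_AC))
```

With any positive slope, growth rates at or below the threshold give zero excretion, and the excretion rate rises in proportion to the excess growth rate above it.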


Figure 1 | Acetate excretion under carbon
limitation. Acetate excretion rate ($J_{ac}$) is linearly
correlated with the growth rate ($\lambda$) for wild-type
(WT) cells grown in minimal medium with various
glycolytic carbon sources (black symbols), and for
cells with titratable or mutant uptake systems (purple
symbols) (Extended Data Table 1). Black diamonds
indicate various carbon sources supplemented with
seven non-degradable amino acids (AA). The red line
shows the best-fit of all the data to equation (1).
[Plot: acetate excretion rate versus growth rate λ (h⁻¹). Legend columns: strain, description, carbon source, symbol; entries include wild-type NCM3722 on glucose, lactose, glycerol and other carbon sources, uptake-titration strains, glycerol-uptake mutants, and NCM3722 supplemented with 7 AA on various carbon sources.]
For strains with titratable carbon uptake systems (Extended Data
Table 1), the same linear dependence is seen for acetate excretion 
(Fig. 1, purple circles and triangles). These results suggest that acetate 
overflow is an innate response that depends on the degree of carbon 
influx and not specifically on the nature of carbon sources. A vivid 
demonstration of this effect is seen by the behaviour of cells grown on 
glycerol: wild-type E. coli cells grow in glycerol minimal medium at a 
rate that is below $\lambda_{ac}$, and do not excrete acetate (Fig. 1, black square),
in accordance with equation (1). Three isogenic strains expressing different
mutant forms of glycerol kinase grew at rates faster than $\lambda_{ac}$
and excreted acetate with rates dictated by their respective growth rates 
according to equation (1) (Fig. 1, purple squares). Instead of changing 
the carbon influx, reducing the metabolic demand of cells for carbon 
by supplementing minimal medium with non-degradable amino acids 
resulted in significantly enhanced growth rates and concomitantly 
increased acetate excretion as described by equation (1) (black 
diamonds in Fig. 1). 


Coarse-grained model of proteome allocation 

Linear growth rate dependences arose in previous physiological
studies from the limited capacity of ribosomes to synthesize proteins
and the obligatory need for increased ribosomal proteins at faster
growth. Here, we address the problem of acetate excretion with a
phenomenological resource allocation model, balancing the demand 
of the proteome for biomass synthesis with the demand for energy 
biogenesis. 

We focus on acetate excretion for growth on glycolytic substrates 
(Fig. 1); other substrates metabolized by alternative pathways exhibit 
similar trends although with quantitative differences (Extended 
Data Fig. 1f), probably arising from the same underlying principles 
as those described here. In our model (detailed in Supplementary 
Note 1A), acetate excretion is considered as a measure of the carbon 
flux directed towards energy biogenesis by (oxidative) fermentation, 
catalysed by glycolytic enzymes and completed by the oxidative 
phosphorylation system (for the conversion of NADH to ATP in an 
aerobic environment) (Extended Data Fig. 2a). Energy biogenesis 
by respiration is catalysed by enzymes of the glycolysis and tricar- 
boxylic acid (TCA) pathways, and the oxidative phosphorylation 
system (Extended Data Fig. 2b). Both the fermentation and respiratory
pathways draw carbon flux away from biomass synthesis, via
the carbon fluxes $J_{C,f}$ and $J_{C,r}$, respectively, and in turn produce the
energy fluxes $J_{E,f}$ and $J_{E,r}$ (Box 1). Let the abundance of the enzymes
used for fermentation and respiration be given by the fractions $\phi_f$
and $\phi_r$, respectively, of the total protein content of the cell. All other
metabolic activities, including catabolism, anabolism and ribosome
synthesis (referred to as biomass synthesis), are provided by the
remaining part of the proteome. Previous studies have established
the growth-rate dependence of the proteome fraction for biomass
synthesis, denoted here as $\phi_{BM}(\lambda)$. It is coupled to energy
biogenesis via the constraint

$$\phi_f + \phi_r + \phi_{BM}(\lambda) = 1 \qquad (2)$$

The total energy flux generated must satisfy the energy demand for cell
growth (denoted by $J_E(\lambda)$), that is,

$$J_{E,f} + J_{E,r} = J_E(\lambda) \qquad (3)$$

At the same time, not too much carbon should be diverted from the
total influx $J_{C,in}$ in order to meet the demand for biomass synthesis (flux
denoted by $J_{C,BM}(\lambda)$), that is,

$$J_{C,in} = J_{C,f} + J_{C,r} + J_{C,BM}(\lambda) \qquad (4)$$


To a large extent, this allocation depends on the efficiencies of 
the energy biogenesis pathways. There are two very different effi- 
ciencies. It is well known that respiration has a much lower carbon 
cost: the energy flux generated per carbon is larger for respiration
than fermentation, although this advantage of respiration is limited
to the presence of oxygen (Extended Data Fig. 2). On the other
hand, if respiration has a higher proteome cost, that is, if the energy
flux generated per proteome fraction devoted to the respective pathway,
$\varepsilon_f = J_{E,f}/\phi_f$, $\varepsilon_r = J_{E,r}/\phi_r$, is lower for respiration than fermentation,
$\varepsilon_f > \varepsilon_r$, as has been suggested previously, then a scenario
emerges that may qualitatively explain the observed disappearance 
of acetate flux at slow growth rates. As illustrated in Box 1, when the 
carbon uptake rate ($J_{C,in}$) is high and the cell has the potential to grow
rapidly, it is advantageous, that is, growth rate can be maximized, to 
generate energy by the more proteome-efficient fermentation path- 
way, so that more of the proteome can be directed towards biosyn- 
thesis as required for rapid growth. Conversely, when carbon uptake 
is low (small $J_{C,in}$), it is advantageous to generate energy by the more
carbon-efficient respiration pathway, so that more carbon flux can be 
directed to biosynthesis and sustain growth. This proteome allocation 
model predicts the carbon flux for respiration to change in the oppo- 
site way from that found for fermentation. Just as the fermentation 
flux can be determined from acetate excretion, the respiration flux 
can be deduced by measuring the rate of CO₂ evolution in a bioreactor
(Supplementary Note 2). Indeed, this respiration flux exhibits a linear 
increase with decreasing growth rate as acetate excretion diminishes 
(Extended Data Fig. 3a). 


Testing the model by growth perturbations 

If acetate excretion is the result of the coordination of energy demand
with carbon influx given constrained proteomic resources as assumed
in the model, then the overexpression of useless proteins, which
reduces the proteome fractions available for energy production and
biomass synthesis, should yield higher acetate excretion rates. In
fact, previous studies reported acetate excretion at slow growth rates
with protein overexpression. To test this hypothesis systematically,
we expressed large amounts of LacZ by growing strain NQ1389
(Extended Data Table 1) on several glycolytic carbon sources. Plotting
acetate excretion against growth rate for varying degrees of LacZ overexpression
leads to a simple proportionality relation between growth
rate and acetate excretion rate for each carbon source tested (Fig. 2a).
Moreover, plotting acetate excretion against the corresponding
degree of LacZ expression (fraction $\phi_Z$ of total cellular proteins), we
find a similar linear decrease in acetate excretion rate (Extended Data
Fig. 4). Finally, in a 3D plot of acetate excretion rates, LacZ abundance 
and growth rates (Fig. 2b), the different data points are found to lie on 
a single plane anchored by the acetate line (red) (see also Extended 
Data Fig. 4c). On this plane, acetate excretion increases linearly with 
LacZ overexpression at each fixed growth rate (black lines). However, 
for each fixed level of LacZ abundance, the plane produces a parallel 
shift of the standard acetate line (thin red lines). These lines are 
still described by equation (1), with an identical slope, but with a 
reduction of the threshold growth rate, $\lambda_{ac}$, linear with increasing
LacZ abundance (cyan line), that is,

$$\lambda_{ac}(\phi_Z) = \lambda_{ac} \cdot (1 - \phi_Z / \phi_{max}) \qquad (5)$$

in which $\phi_{max} \approx 47\%$ is the extrapolated limit of useless protein expression
at which growth rate vanishes (alternatively determined from
individual lines in Extended Data Fig. 4), in agreement with previous
work. More quantitatively, this result is displayed in Fig. 3a, in
which interpolated acetate excretion rates for constant LacZ levels are 
presented. 
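As a rough numerical illustration of equation (5), the sketch below computes the shifted threshold for a given LacZ proteome fraction and the resulting parallel-shifted acetate line. The values 0.76 per hour and 47% come from the text; the slope `s_ac=10.0` is a hypothetical placeholder, and the function names are introduced here only for the example.

```python
def shifted_threshold(phi_z, lambda_ac=0.76, phi_max=0.47):
    """Equation (5): threshold growth rate at useless-protein fraction phi_z."""
    if not 0.0 <= phi_z <= phi_max:
        raise ValueError("phi_z must lie between 0 and phi_max")
    return lambda_ac * (1.0 - phi_z / phi_max)


def acetate_rate_with_lacz(growth_rate, phi_z, s_ac=10.0):
    """Acetate line of equation (1) with the threshold shifted by equation (5).

    ASSUMPTION: s_ac is an illustrative slope, not a value from the paper.
    The slope is kept fixed, so the line shifts in parallel as phi_z grows.
    """
    return s_ac * max(growth_rate - shifted_threshold(phi_z), 0.0)


if __name__ == "__main__":
    for phi_z in (0.0, 0.1, 0.235, 0.47):
        print(phi_z, shifted_threshold(phi_z))
```

At a fixed growth rate, raising `phi_z` lowers the threshold and therefore raises the computed excretion rate, which is the behaviour described for the black lines in Fig. 2b.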

The concepts represented by equations (2)-(4) are transformed into 
a quantitative model (as illustrated in Box 1 and detailed in 
Supplementary Note 1A) by implementing a simple set of relations. 
First, the proteome fraction $\phi_{BM}$ responsible for biomass synthesis
under carbon-limitation follows a linear growth-rate dependence,
that is, $\phi_{BM}(\lambda) = \phi_0 + b\lambda$, as established by previous studies.
BOX 1
Resource allocation model for energy biogenesis


Top left, efficiencies of energy production. Fermentation and respiration pathways for energy biogenesis are shown in the red and blue boxes,
respectively. The model assumes that for the same energy flux generated (width of yellow arrows), fermentation needs to draw more carbon
flux than respiration (compare the width of light grey arrows), but requires a smaller amount of proteins (compare the number of red and blue
proteins). Top, model summary. The model consists of three resource-balance equations. (1) Carbon flux ($J_{C,in}$) is used for energy production
via fermentation or respiration ($J_{C,f}$, $J_{C,r}$), and to provide precursors for biomass production ($\beta\lambda$). (2) Fermentation and respiration pathways
supply ATP flux ($J_{E,f}$, $J_{E,r}$) that satisfies the energy demand of the cell ($\sigma\lambda$). (3) The proteome fraction required for biomass synthesis ($\phi_0 + b\lambda$)
depends linearly on the growth rate, thereby constraining the proteome fraction available for energy biosynthesis ($\phi_f$, $\phi_r$). Bottom, model
predictions. Under carbon limitation, the model predicts threshold-linear dependences of fermentation and respiration with changing growth
rate. Respiration (blue line) gradually replaces fermentation (red line) as the growth rate decreases. Proteome limitation by expression of
useless proteins results in a horizontal shift of the acetate line. Translational limitation by antibiotics results in an increased slope of the
acetate line with a fixed y-intercept. Energy dissipation also leads to a parallel shift of the acetate line. But unlike proteome limitation, which
‘compresses’ both the fermentation and respiration sectors, these two sectors both increase with decreasing growth rate under energy
dissipation (for fixed carbon uptake). The behaviours summarized in these plots are derived quantitatively in Supplementary Note 1 and
validated in Figs 1-3.
Second, empirical evidence indicates linear relations between metabolic
fluxes and the abundances of the corresponding proteome sectors,
which we capture by the equations $J_{C,f} = \kappa_f \phi_f$, $J_{C,r} = \kappa_r \phi_r$, $J_{E,f} = \varepsilon_f \phi_f$ and
$J_{E,r} = \varepsilon_r \phi_r$. Finally, we introduce proportionalities of biomass and energy
demand to the growth rate ($J_{C,BM}(\lambda) = \beta\lambda$; $J_E(\lambda) = \sigma\lambda$), relations that
are demonstrated experimentally (see Supplementary Notes 1D2 and
1D4). (Maintenance energy is negligible over the growth-rate range
studied.) The detailed meaning of each parameter introduced here is
given in Extended Data Table 2. Most important among them are $\varepsilon_f$
and $\varepsilon_r$, the proteome efficiencies of energy biogenesis by the fermentation
and respiration pathways, respectively.
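To make the algebra concrete, the following Python sketch solves the proteome constraint (2) and the energy balance (3) for the two energy sectors, using the linear relations stated above (energy flux equal to efficiency times proteome fraction, biomass fraction linear in growth rate, energy demand proportional to growth rate). It is a minimal sketch under stated assumptions, not the authors' implementation: all parameter values in the example are made up for illustration, the function names are hypothetical, and below the threshold the code assumes that respiration alone meets the energy demand while the proteome constraint is left slack, which is a simplification introduced here.

```python
def threshold_growth_rate(eps_r, sigma, phi_0, b):
    """Growth rate at which the fermentation fraction phi_f reaches zero.

    Obtained by setting phi_f = 0 in constraints (2) and (3):
    eps_r * (1 - phi_0 - b * lam) = sigma * lam.
    """
    return eps_r * (1.0 - phi_0) / (sigma + eps_r * b)


def allocate_proteome(growth_rate, eps_f, eps_r, sigma, phi_0, b):
    """Return (phi_f, phi_r, phi_bm) for a given growth rate.

    ASSUMPTIONS: eps_f > eps_r (fermentation is more proteome-efficient),
    and below the threshold only respiration is used.
    """
    if eps_f <= eps_r:
        raise ValueError("this sketch requires eps_f > eps_r")
    phi_bm = phi_0 + b * growth_rate      # biomass sector, linear in growth rate
    available = 1.0 - phi_bm              # proteome left for energy biogenesis
    demand = sigma * growth_rate          # energy demand, proportional to growth
    if available < 0.0 or demand > eps_f * available:
        raise ValueError("growth rate not attainable with these parameters")
    if demand <= eps_r * available:
        # Respiration alone can cover the demand: no fermentation.
        return 0.0, demand / eps_r, phi_bm
    # Solve phi_f + phi_r = available and eps_f*phi_f + eps_r*phi_r = demand.
    phi_f = (demand - eps_r * available) / (eps_f - eps_r)
    phi_r = available - phi_f
    return phi_f, phi_r, phi_bm


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to give round numbers.
    params = dict(eps_f=2.0, eps_r=1.0, sigma=1.0, phi_0=0.2, b=0.3)
    for lam in (0.3, 0.5, 0.7, 0.8, 0.9):
        print(lam, allocate_proteome(lam, **params))
```

Because `phi_f` is linear in the growth rate above the threshold and zero below it, and the fermentation carbon flux is proportional to `phi_f`, the sketch reproduces the threshold-linear shape of the acetate line, while `phi_r` rises as the growth rate falls towards the threshold.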

Equations (2)-(4), together with the linear relation between the 
proteome fractions and fluxes, describe all key features of the experi- 
mental data as detailed in Supplementary Notes 1B and 1C and 
illustrated in Box 1 (bottom): the model naturally gives rise to 
the observed threshold-linear form of acetate excretion equation (1), 
Figure 2 | Effect of protein overexpression on acetate excretion.

a, Measured acetate excretion rate is plotted against growth rate for 
increasing degrees of the (useless) expression of LacZ in strain NQ1389, 
for several carbon sources indicated by circles of different colours. Thick 
red line is the acetate line of wild-type cells shown already in Fig. 1, and 
the thin lines are model predictions (equation (S26) in Supplementary 


with formulae for the threshold $\lambda_{ac}$ and the slope $s_{ac}$ given by
equations (S15) and (S16) in Supplementary Information; the linear
decrease in energy-related CO₂ production upon increasing growth
rate (Extended Data Fig. 3a) is captured by equation (S17), with the
corresponding threshold $\lambda_{CO_2,r}$ and slope $s_{CO_2,r}$ given by equations
(S18) and (S19). Furthermore, the parallel shifts of the acetate line
for a constant level of protein overexpression (Fig. 3a, and thin 
red lines in Fig. 2b) are captured by equation (S30) in Supplementary 
Information, while the direct proportionalities between acetate 
excretion rates and growth rate upon varying the degree of overex- 
pression (thin solid lines in Fig. 2a) are captured by equation (S26). 
Figure 3 | Effect of genetic and environmental perturbations on the
acetate line: model and experiments. In all subplots, the thick red line
represents the acetate line of wild-type cells (Fig. 1). a, For each fixed
level of LacZ expression, growth of the overexpression strain NQ1389
on different carbon sources leads to parallel shifts of the acetate line, as
demonstrated by the thin red lines, whose slopes are fixed to that of the
wild-type acetate line and which were fitted by adjusting only the threshold
growth rates $\lambda_{ac}$. b, Acetate excretion rate with glucose uptake titration
(Pu-ptsG, Extended Data Table 1) for a ΔflhD strain (NQ1388) and a
ΔfliA strain (NQ1539), both incapable of expressing motility proteins,
regarded as ‘useless’ in well-stirred culture. Because the motility proteins
are only expressed significantly as growth rate decreases under carbon
limitation (Extended Data Fig. 4f), acetate excretion deviates from the
acetate line as growth rate decreases. The shift of the threshold growth rate
$\lambda_{ac}$ in the two strains is quantitatively consistent with the model prediction
Information), one for each carbon source of respective colour. G6P,
glucose 6-phosphate. b, 3D plot of the data in a. The data lie largely on a
plane spanned by the acetate line (thick red line) defined in Fig. 1, and the
cyan line, $\lambda_{ac}(\phi_Z)$, which defines a linear shift in the threshold growth rate,
$\lambda_{ac}$, for different degrees of LacZ expression, as predicted by the model
(equation (5)).


Notably, the data impose a set of quantitative constraints on the model
parameters, in particular,

$$\frac{\varepsilon_f}{\varepsilon_r} \geq \frac{\lambda_{CO_2,r}}{\lambda_{ac}} \approx 1.5, \qquad (6)$$


predicting that fermentation is at least 50% more efficient for energy 
biogenesis than respiration in terms of proteome cost (see equations 
(S20)-(S22) in Supplementary Information for a derivation). 

To test quantitatively the proteome allocation model, we performed 
additional sets of experiments designed to perturb individual model 
(equation (5)), based on quantification of motility proteins (Extended
Data Fig. 4f and Extended Data Table 3). c, For each fixed (sub-lethal)
dose of chloramphenicol (Cm) in the medium, acetate excretion rates
were determined for different degrees of lactose uptake by the titratable
LacY strain (NQ381, Extended Data Table 1). The thin red lines are
model fits with the slope as the only fitting parameter. d, For the
energy-dissipating mutant (NQ1313) expressing the proton-leaking LacY^A177V,
acetate excretion rates (triangles) obtained from titrating the glucose
uptake system (Extended Data Table 1) show a parallel shift of the
acetate line (thin red line), obtained by fitting the data by adjusting
only the threshold growth rate in accordance with model prediction
(equation (S32) in Supplementary Information). For comparison,
acetate excretion in cells expressing wild-type LacY from the same 
plasmid system (NQ1312, circles) adheres much closer to the acetate line 
of wild-type cells. 
parameters. To see whether a decrease in acetate excretion is possible,
we examined two independent mutants (ΔflhD and ΔfliA) in
which motility proteins, ‘useless’ in well-shaken batch culture, are not 
expressed. These mutants exhibit reductions in acetate excretion (open 
symbols in Fig. 3b), in accordance with the prediction of the resource 
allocation model as more proteome becomes available for energy bio- 
genesis. (The nonlinear dependence of acetate excretion arises in this 
case due to the growth-rate dependence of motility protein expression 
in wild-type cells, as shown in Extended Data Fig. 4f.) Translational
limitation by sub-lethal doses of the antibiotic chloramphenicol inhibits
peptide elongation and makes the cell respond by allocating a larger
proteome fraction to ribosomes. In our model, this affects the parameter
$b$ in $\phi_{BM}(\lambda)$, and the model therefore predicts an increased slope of the acetate
line with an identical y-intercept (equation (S14) in Supplementary
Information, solid lines in Fig. 3c), which is in good agreement with
the data (open symbols in Fig. 3c). We also investigated the effect of
energy dissipation on acetate excretion by expressing a mutant lactose
transporter LacY^A177V (‘leaky LacY’) known to leak protons across the
inner membrane (Extended Data Table 1). An energy leakage flux can
be added to the right-hand side of equation (3), and the model predicts 
a parallel shift of the acetate line to higher excretion rates (equation 
(S32) in Supplementary Information). This prediction was tested by 
titrating glucose uptake in a strain (NQ1313) expressing the leaky LacY 
mutant. As anticipated, a parallel shift to higher acetate excretion rates 
was obtained (purple triangles and line in Fig. 3d). Similar increases in 
acetate excretion were obtained with the addition of 2,4-dinitrophenol 
(DNP), which uncouples oxidative phosphorylation by carrying pro- 
tons across the cell membrane (Extended Data Fig. 5). A summary of 
quantitative comparisons between predictions of the proteome alloca- 
tion model with experimental findings is presented in Extended Data 
Table 3, showing that the model quantitatively captures the changes 
of acetate excretion patterns in response to the applied perturbations. 


Proteome cost of fermentation and respiration 

The theoretical predictions tested so far do not require the knowledge 
of the values of proteome cost parameters (for example, $\varepsilon_f$, $\varepsilon_r$). However,
these parameters are of central importance for theories based on pro- 
teome allocation. We have thus developed a coarse-graining approach 
to characterize the proteome cost for fermentation and respiration 
directly. First, the absolute protein abundance of individual proteins 
was obtained using quantitative mass spectrometry together with
absolute abundance calibration by ribosome profiling. Next, for each
enzyme involved in glycolysis, TCA and oxidative phosphorylation, 
its abundance was partitioned among the three pathways, fermen- 
tation, respiration and biomass synthesis, in proportion to the three 
fluxes through the enzyme. Finally, the fractional enzyme amounts 
partitioned into a pathway were summed up to obtain the total enzyme
abundance devoted to the pathway (Fig. 4a; see Supplementary Note 3 
for details). In Fig. 4b, the energy production fluxes of fermentation 
and respiration (Extended Data Fig. 3b) are plotted against their 
respective proteome fractions determined in this manner. The
linearity of the results validates the linear dependences between $J_{E,f}$, $J_{E,r}$
and $\phi_f$, $\phi_r$ assumed in the model, while slopes of these lines directly
yield the proteome efficiency of energy biogenesis for glycolytic carbon
sources: $\varepsilon_f \approx 750$ mM ATP per A600 nm per hour, and $\varepsilon_r \approx 390$ mM ATP
per A600 nm per hour. Indeed, the proteome cost of fermentation ($1/\varepsilon_f$)
is approximately twofold lower than that of respiration ($1/\varepsilon_r$),
quantitatively validating the key assumption of this work. Together with
similar procedures used to determine the other model parameters 
(as described in detail in Supplementary Note 1D), we obtained a 
self-consistent set of parameters (Extended Data Table 2) that success- 
fully recapitulates all our experimental data (Extended Data Fig. 3c, d). 
The proteome allocation model is able to predict not only acetate 
excretion patterns but also the expression of dozens of genes in the 
glycolysis and TCA pathways under different perturbations, as detailed
in Supplementary Note 1C and Extended Data Figs 6 and 7. While both
proteome limitation and energy dissipation lead to parallel shifts of the
acetate line (Fig. 3a and d, respectively), these shifts arise from
opposite responses of the energy sectors $\phi_f$, $\phi_r$, as predicted by the model
and verified by mass spectrometry. Under LacZ overexpression, cells 
decreased the expression of enzymes for both fermentation and respira- 
tion (orange lines in Extended Data Figs 6 and 7, as predicted in equa- 
tions (S26)-(S27) in Supplementary Information), while under energy 
dissipation, cells increased the expression of these enzymes (blue lines 
in Extended Data Figs 6 and 7, predicted by equations (S36)-(S37)). 


Discussion 
The notion that fermentation may be more proteome efficient than 
respiration was proposed previously by Molenaar et al., extended to
the use of the Entner-Doudoroff pathway by Flamholz et al. and to
the genome-scale by O’Brien et al. Our study directly verifies this
hypothesis (Fig. 4b), and establishes the pivotal role proteome efficiency 
has in determining the degree of overflow metabolism in E. coli (Fig. 3). 
Our findings in response to useless protein expression and energy dis- 
sipation are difficult to reconcile, even qualitatively, with alternative 
hypotheses such as the limitation of respiratory capacity, the need for
recycling of cofactors, or constraints of the cytoplasmic membrane.
Models with cell volume constraints are mathematically similar to
protein cost models; however, cell volume varies widely between growth
conditions with similar densities, suggesting that it is not a constraint.
Mechanistically, the re-uptake of acetate by acetyl-CoA synthase 
(ACS), upregulated by the cAMP receptor protein (CRP)-cAMP 
Figure 4 | Partition of proteome fractions into flux components. a, The
total abundance of all proteins devoted to glycolysis, TCA and oxidative
phosphorylation (OXPHOS) (defined in Supplementary Note 3) is
represented as a fraction of total protein for various degrees of lactose
uptake in strain NQ381. Different colours indicate the proportions of these
proteins dedicated to biomass production, fermentation and respiration,
estimated using the corresponding fraction of fluxes; see Supplementary
Note 3. b, Energy fluxes through the fermentation and respiration
pathways are plotted against their respective proteome fractions. The
lines are linear regressions of the data, with the slopes being the proteome
efficiencies ($\varepsilon_f$ and $\varepsilon_r$). The steeper slope (for fermentation) indicates
higher ATP production per protein devoted to the pathway (lower protein 
cost), validating the central assumption of the proteome allocation model. 
complex under carbon limitation, is important for the decrease of
acetate excretion under carbon limitation. However, the linear
growth-rate dependence of carbon overflow demands the tight global 
coordination of energy biogenesis pathways with biosynthesis, which 
cannot be accounted for by ACS activity alone, and requires the 
coordinated regulation of glycolytic and TCA enzymes. In particular, 
acetate excretion sharply increased under energy dissipation while the 
abundance of ACS proteins increased slightly as well (Extended Data 
Fig. 6). Moreover, the parallel increase of glycolytic and TCA enzymes, 
in combination with the increase in acetate excretion under energy 
dissipation, cannot be rationalized from the known actions of CRP 
regulation (Extended Data Figs 6 and 7), suggesting the role of addi- 
tional regulator(s). 

We have established how diverse patterns of acetate excretion can be 
understood as a part of a global physiological response used by E. coli 
to cope with changing proteomic demands of energy biogenesis and 
biomass synthesis under different growth conditions. Our findings
can be used to guide approaches to minimizing overflow metabolism
in synthetic biology applications, in ways congruent with the
fitness of the organism, for example, by reducing the expression of
useless proteins (Fig. 3b). More broadly, a similar physiological
rationale may underlie overflow metabolism in rapidly growing eukaryotes
including tumour cells, in which the synthesis of mitochondria for
TCA reactions is an additional cost. Indeed, scatter plots of ethanol
production and sugar uptake in various strains of Saccharomyces
cerevisiae and other yeast species point to the existence of a universal
response similar to that shown for E. coli (Fig. 1). The quantitative
physiological approach developed in this work can be used as a model 
for characterizing metabolic efficiency and its biological implications 
in these systems and others. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS
No statistical methods were used to predetermine sample size.
Construction of LacY and LacY^A177V strains (NQ1312 and NQ1313). The
Ptet-lacY region of the pZE12 Ptet-lacY plasmid was amplified with upstream and
downstream primers including the digestion sites XhoI and BamHI, respectively,
using the primers ptet-F and lacY-R (see primers below). The resulting DNA
fragment was used to replace the corresponding region of Ptet-gfp in the plasmid
pZA31-gfp, yielding the plasmid pZA31 Ptet-lacY. This plasmid was transformed
into the titratable PtsG strain NQ1243 to yield NQ1312. The same procedure was
employed to generate the lacY^A177V mutant (that is, C531T), but fusion PCR was
used to introduce the point mutation Val177 into the lacY sequence. For this, two
overlapping parts of the Ptet-lacY region were PCR amplified with the primers
ptet-F, lacYfusion-R and lacYfusion-F, lacY-R (see primers below), in which the
point substitution C531T leading to the Val177 mutation from ref. 3 was included
in the primers lacYfusion-F and lacYfusion-R. These two overlapping DNA
fragments were fused together by PCR using primers ptet-F and lacY-R. The resulting
Ptet-lacY fragment that carries the desired mutation was inserted into pZA31,
yielding pZA31-lacY^A177V. The resulting plasmid was transformed into the
titratable PtsG strain NQ1243 to yield NQ1313.
Construction of the flhD and fliA deletion strains (NQ1388 and NQ1539). 
The ΔflhD deletion allele in strain JW1881-1 (E. coli Genetic Stock Center, Yale
University), in which a kanamycin-resistance gene is substituted for the flhD gene,
was transferred to the titratable PtsG strain NQ1243 after deletion of kanamycin
resistance by phage P1 vir-mediated transduction. Similarly, the ΔfliA allele from
strain JW1907 (KEIO collection), in which a kanamycin-resistance gene is
substituted for the fliA gene, was transferred to the titratable PtsG strain NQ1243 after
deletion of kanamycin resistance by phage P1 vir-mediated transduction. 
Primers used in this study. The following primers were used to produce the new
genetic constructs. ptet-F, 5′-ACACTCGAGTCCCTATCAGTGATAGAGATTG-3′, was used
for forward amplification of the Ptet sequence and included an XhoI digestion
site for construction of pZE1 Ptetstab-lacZ, pZA31-lacY and pZA31-lacY^A177V.
lacY-R, 5′-TGTGGATCCTTAAGCGACTTCATTCACCTG-3′, was used for reverse
amplification of lacY and lacY^A177V and included a BamHI digestion site for
construction of pZA31-lacY and pZA31-lacY^A177V. lacYfusion-F,
5′-CTCTGGCTGTGTACTCATCCTCGCCGTTTTACTCTTTTTCGCCAAAACGG-3′, was used for
forward amplification of a fragment of lacY together with the reverse primer
lacY-R. This DNA fragment was later used for fusion PCR to construct
pZA31-lacY^A177V. lacYfusion-R,
5′-CCGTTTTGGCGAAAAAGAGTAAAACGGCGAGGATGAGTACACAGCCAGAG-3′, was used for
reverse amplification of a fragment of Ptet-lacY together with the forward primer
ptet-F. This DNA fragment was later used for fusion PCR to construct
pZA31-lacY^A177V.
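
As a quick sanity check on the primer sequences listed above, the following Python sketch verifies two properties one would expect for this kind of design: that the two fusion primers are exact reverse complements of each other (so the two PCR fragments overlap), and that the outer primers carry the stated restriction sites. The recognition sequences used for XhoI (CTCGAG) and BamHI (GGATCC) are the standard ones; the helper name `reverse_complement` is illustrative and not part of the original methods.

```python
# Primer sequences as listed in the Methods (5' to 3').
PTET_F = "ACACTCGAGTCCCTATCAGTGATAGAGATTG"
LACY_R = "TGTGGATCCTTAAGCGACTTCATTCACCTG"
LACY_FUSION_F = "CTCTGGCTGTGTACTCATCCTCGCCGTTTTACTCTTTTTCGCCAAAACGG"
LACY_FUSION_R = "CCGTTTTGGCGAAAAAGAGTAAAACGGCGAGGATGAGTACACAGCCAGAG"

# Standard recognition sequences of the two restriction enzymes.
XHOI_SITE = "CTCGAG"
BAMHI_SITE = "GGATCC"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-case ACGT only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    print("fusion primers overlap fully:",
          reverse_complement(LACY_FUSION_F) == LACY_FUSION_R)
    print("ptet-F carries XhoI site:", XHOI_SITE in PTET_F)
    print("lacY-R carries BamHI site:", BAMHI_SITE in LACY_R)
```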
Bacterial culture media. Our growth media were based on the MOPS-buffered 
minimal medium used previously with slight modifications. The base medium
contains 40 mM MOPS and 4 mM tricine (adjusted to pH 7.4 with KOH),
0.1 M NaCl, 10 mM NH₄Cl, 1.32 mM KH₂PO₄, 0.523 mM MgCl₂, 0.276 mM
Na₂SO₄, 0.1 mM FeSO₄ and the trace micronutrients described previously.
For ¹⁵N-labelled media, ¹⁵NH₄Cl was used in place of ¹⁴NH₄Cl. The concentrations
of the carbon sources and various supplements used are indicated in the relevant
tables.

Batch culture growth has been described in detail previously.
Bacterial growth in the bioreactor. To measure CO₂ production from bacterial
growth, cells were grown in a Multifors bioreactor (Infors HT). Medium (400 ml)
was used in a 750-ml vessel, which has an inlet for compressed air and an outlet
for the exhaust gas. The vessel is otherwise closed except during brief periods of
sample collection. Samples of the cell culture (for reading A600 nm, assaying lactose
and acetate, etc.) can be taken by using a syringe connected to the vessel. The air
flow rate to the inlet was controlled by a mass flow controller (Cole-Parmer,
32907-67) and maintained at 400 ml min⁻¹. The outlet was connected to a BlueInOne
Cell sensor unit (BlueSens) for measuring CO₂ concentration. The stir rate in
the growth vessel was set to 800 r.p.m. and the temperature was maintained at 37°C.
Glucose assay. Samples (100 µl) were taken at least eight times during
exponential growth (typically at A600 nm between 0.1 and 0.6) and immediately
frozen. Before the assay, samples were thawed in water and immediately centrifuged
at maximum speed (13,200g) for 2.5 min. Supernatant (7 µl) was used to measure
glucose concentrations using the Glucose Assay Kit (GAHK-20, Sigma-Aldrich).
The slope of the plot of glucose concentrations versus A600 nm for all replicates
(multiplied by the measured growth rate) was used to determine the glucose
uptake rate.
Lactose assay. To assay lactose, ~10 µl of the collected supernatant was first
digested by β-galactosidase (Sigma-Aldrich) in Z-buffer at 37°C for 20 min. The
released glucose was then assayed enzymatically using a commercially available kit
(Glucose Assay Kit, GAHK20; Sigma-Aldrich). As a control, the sample was treated
in the same way without β-galactosidase. Little glucose was detected in the control.
Acetate assay. Samples (200 µl) were taken at least three times during
exponential growth (typically at A600 nm between 0.1 and 0.6) and immediately
frozen. Before the assay, samples were thawed in water and immediately
centrifuged at maximum speed (13,200g) for 2.5 min. Supernatant (100 µl) was used
to measure acetate concentrations using the Acetate Assay kit (10148261035,
R-Biopharm). The slope of the plot of acetate concentrations versus A600 nm for all
replicates (multiplied by the measured growth rate) was used to determine the
acetate excretion rate.
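
The slope-times-growth-rate rule used in the glucose and acetate assays can be sketched in a few lines of Python. The reasoning is that during exponential growth the culture density A grows as dA/dt = λA while a metabolite excreted at rate J per unit density accumulates as dc/dt = JA, so dc/dA = J/λ and hence J = λ × slope. The data values below are made up purely for illustration, and the function names are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def flux_from_slope(a600, conc_mM, growth_rate_per_h):
    """Flux in mM per A600 per hour: slope of concentration vs A600 times growth rate.

    Positive for an excreted product (e.g. acetate); negative for a consumed
    substrate (e.g. glucose), whose uptake rate is the negative of this value.
    """
    return least_squares_slope(a600, conc_mM) * growth_rate_per_h


if __name__ == "__main__":
    # Hypothetical example data, not measurements from the paper.
    a600 = [0.1, 0.2, 0.4, 0.6]
    acetate_mM = [0.5, 1.0, 2.0, 3.0]
    print("acetate excretion rate:", flux_from_slope(a600, acetate_mM, 0.9))
```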

β-galactosidase assay. The assay was performed following a protocol similar to that
detailed in a previous study.

Proteomic mass spectrometry. Protein mass spectrometry samples were collected
from the four bioreactor cultures, a water bath culture of EQ353 grown on
glucose minimal medium, and two ¹⁵N-labelled water bath cultures of NCM3722
on lactose minimal medium and NQ381 with 200 µM 3-methylbenzyl alcohol.
For each of the cultures, 1.8 ml of cell culture at A600 nm = 0.4-0.5 during the
exponential phase was collected by centrifugation. The cell pellet was re-suspended in
0.2 ml water and fast frozen on dry ice.

Sample preparation and mass spectrometry methods have been described
previously.

Protein identification. The raw mass spectrometry data files generated by the AB 
SCIEX TripleTOF 5600 system were converted to Mascot generic format (mgf) 
files, which were submitted to the Mascot database searching engine (Matrix 
Sciences) against the E. coli SwissProt database to identify proteins. The following 
parameters were used in the Mascot searches: maximum of two missed trypsin
cleavages, fixed carbamidomethyl modification, variable oxidation modification,
peptide tolerance 0.1 daltons (Da), MS/MS tolerance 0.1 Da, and 1+, 2+ 
and 3+ peptide charge. All peptides with scores less than the identity threshold 
(P=0.05) were discarded. 

Relative protein quantification. The raw mass spectrometry data files were con- 
verted to the .mzML and .mgf formats using conversion tools provided by AB 
Sciex. The .mgf files were used to identify sequencing events against the Mascot 
database. Finally, results of the Mascot search were submitted with .mzML files 
to our in-house quantification software. In brief, intensity is collected for each
peptide over a box in retention time and m/z space that encloses the envelope for
the light and heavy peaks. The data are collapsed in the retention time dimension
and the light and heavy peaks are fit to a multinomial distribution (a function of
the chemical formula of each peptide) using a least squares Fourier transform
convolution routine, which yields the relative intensity of the light and heavy
species. The ratio of the non-labelled to labelled peaks was obtained for each pep- 
tide in each sample. 

The relative protein quantification data for each protein in each sample mixture 
was then obtained as a ratio by taking the median of the ratios of its peptides. No 
ratio (that is, no data) was obtained if there was only one peptide for the protein. 
The uncertainty for each ratio was defined as the two quartiles associated with the
median. To filter out data with poor quality, the ratio was removed for the protein
in that sample if at least one of its quartiles lay outside the 50% range of its median.
Furthermore, ratios were removed for a protein in all the sample mixtures in a
growth limitation if at least one of the ratios had one of its quartiles lying outside
the 100% range of the median.
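
The per-sample part of this filtering can be sketched as follows. This is only one possible reading of the procedure: it assumes that "outside the 50% range of its median" means a quartile below 0.5 × median or above 1.5 × median, and it uses the inclusive quartile definition from Python's standard library, neither of which is specified in the text. The function name `protein_ratio` is hypothetical.

```python
import statistics


def protein_ratio(peptide_ratios, tolerance=0.5):
    """Median peptide ratio for one protein in one sample, or None if rejected.

    Returns (median, lower_quartile, upper_quartile) when the protein passes.
    Assumptions (not stated in the source): quartiles use the 'inclusive'
    method, and the allowed band is median * (1 - tolerance) to
    median * (1 + tolerance).
    """
    if len(peptide_ratios) < 2:
        return None  # a single peptide yields no ratio
    median = statistics.median(peptide_ratios)
    q1, _, q3 = statistics.quantiles(peptide_ratios, n=4, method="inclusive")
    low, high = median * (1 - tolerance), median * (1 + tolerance)
    if q1 < low or q3 > high:
        return None  # quartiles too far from the median: poor-quality data
    return median, q1, q3


if __name__ == "__main__":
    print(protein_ratio([1.0, 1.2]))            # consistent peptides: kept
    print(protein_ratio([0.1, 1.0, 5.0, 9.0]))  # widely scattered peptides: rejected
    print(protein_ratio([1.0]))                 # single peptide: no ratio
```

The second, stricter-scope filter (dropping a protein from a whole growth-limitation series when any one ratio exceeds the 100% range) could reuse the same function with `tolerance=1.0` applied across the series.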

Absolute protein quantification using spectral counting data. The spectral 
counting data used for absolute quantitation were extracted from the Mascot 
search results. For our ¹⁵N and ¹⁴N mixture samples, only the ¹⁴N spectra were
counted. The absolute abundance of a protein was calculated by dividing the total
number of ¹⁴N spectra of all peptides for that protein by the total number of ¹⁴N
spectra in the sample.

Absolute quantification of LacZ protein using purified LacZ protein as 
standard, and determination of the converting factor between Miller Unit 
and proteome fraction. For the condition of the LacZ overexpression strain 
(NQ1389) grown on glucose medium with zero chlorotetracycline level (see 
source data file of Fig. 2), a ¹⁵N sample was prepared, that is, NQ1389 grown on
glucose minimal medium with ¹⁵NH₄Cl. The sample was mixed with a known
amount of purified LacZ protein (Roche Diagnostics, 10745731001), the purity of
which was verified both on an SDS-PAGE gel (where a single band was observed)
and by checking the spectral counts of ¹⁴N peptides in the sample (where ~99%
of the ¹⁴N peptides are LacZ peptides). With the highly accurate relative protein
abundance between the purified ¹⁴N LacZ and the ¹⁵N LacZ in the sample, the
proteome fraction of LacZ in the sample was determined to be 3.3% ± 0.3%. The
average Miller Unit (MU) for the same condition was ~20,550 (see source data 
file of Fig. 2), leading to a converting factor of 1.6% of proteome fraction for 
10,000 MU. 
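
The conversion factor is a simple proportionality between the measured proteome fraction and the measured Miller Units, which the short Python sketch below reproduces from the two numbers quoted above. The function name is illustrative only.

```python
def proteome_percent_per_mu(fraction_percent, miller_units, per_mu=10_000):
    """Proteome fraction (in %) corresponding to `per_mu` Miller Units,
    assuming the fraction scales linearly with Miller Units."""
    return fraction_percent / miller_units * per_mu


if __name__ == "__main__":
    factor = proteome_percent_per_mu(3.3, 20_550)
    print(f"{factor:.1f}% of the proteome per 10,000 MU")
```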

Uncertainty of individual measurements. Biological replicates show the following 
typical uncertainties in measured quantities: growth rate, ~5%; acetate excretion 
rates, ~15%; CO₂ evolution rate, ~5%.
Uncertainties of linear relations. The parameters and their associated standard
errors for linear relations were obtained by carrying out linear regression.
Following our approach, multiple measurements over wide ranges of conditions
form robust data sets revealing underlying relations between variables.
The uncertainties are reported in Extended Data Tables 2 and 3, and throughout 
the text. 
Extended Data Figure 1 | Acetate excretion data. a-f, The acetate
excretion data are shown for E. coli cells grown in chemostat (a-e) and
for cells growing on medium with non-glycolytic carbon sources (f). 

a, Glucose-limited chemostat data based on figure 1 from ref. 22. 

b, Glucose-limited and pyruvate-limited chemostat data from table 7 
of ref. 57. c, Glucose-limited chemostat data based on figure 3 of ref. 23. 
Only data with dilution rates less than the apparent washout dilution 
rate are plotted here. d, Glucose-limited chemostat data from table 1 

of ref. 15. e, Glucose-limited chemostat data based on figure 1 of ref. 16. 
f, E. coli K-12 NCM3722 was grown in minimal medium with one of five 
non-glycolytic carbon sources, including two gluconeogenic substrates 
(pyruvate and lactate), one substrate of the pentose phosphate pathway 
(gluconate), and two intermediates of the TCA pathway (succinate and 
fumarate). Deviation from the acetate line (the red line, as defined in 
Fig. 1 and equation (1) of the main text) is seen most notably for pyruvate,
on which cells excrete a very large amount of acetate, and to a lesser degree also for
lactate and gluconate, which enter glycolysis as pyruvate. In the framework
of our model, these deviations result from different proteome efficiencies
of fermentation and respiration on these carbon sources. Note: acetate
excretion measurement was also attempted for growth on LB. However,
growth on LB is not characterized by a single exponential steady-state
growth phase, as various constituents of the medium are depleted during
the course of batch culture growth. Assuming exponential growth for
A600 nm data below 0.3 and alternatively from 0.3 to 0.5 gave doubling times
of 18 min and 28 min, respectively. The corresponding acetate excretion
rates were 14.3 and 3.6 mM A600 nm⁻¹ h⁻¹. These data should be regarded
as semi-quantitative owing to the non-steady nature of growth on LB.
Extended Data Figure 2 | The (oxidative) fermentation and respiration
pathways. a, Schematic illustration of the fermentation pathway, using
glucose as an exemplary carbon source. The pathway is shown as the
coloured part. One molecule of glucose is catabolized into two molecules
of acetate and two molecules of CO₂ (not shown in the diagram), with
four molecules of ATP generated via substrate phosphorylation and also 
four molecules of NADH produced. In the aerobic environment, NADH 
molecules can be converted into ATP molecules. The total number of 
ATP molecules produced per glucose molecule is therefore 4 + 4x, in 
which the conversion factor x indicates the number of ATP molecules 
converted from one NADH molecule (that is, ATP: NADH = x:1). In the 
illustration, we assume that two molecules of ATP are converted from one
NADH molecule; that is, x = 2. b, Schematic illustration of the respiration
pathway, using glucose as an exemplary carbon source. The pathway
is shown as the coloured part. One molecule of glucose is catabolized
into 6 molecules of CO₂ (not shown in the diagram), producing
4 molecules of ATP, 6 molecules of NADH, 2 molecules of NADPH, and
2 molecules of FADH₂. Using ATP:NADH = x:1, ATP:NADPH = x:1 and
ATP:FADH₂ = x:2, we have the total number of ATP molecules produced
as 4 + 9x for the respiration pathway. Here in the illustration, we assumed
x = 2. Note that the ratio of total ATP produced from respiration over total
ATP produced from fermentation depends on the conversion factor x;
that is, (4 + 9x)/(4 + 4x). The value of this ratio ranges from 1 (for x = 0)
to 9/4 (for x → ∞).
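
The dependence of the ATP yield ratio on the conversion factor x can be checked numerically with a minimal Python sketch that uses only the two totals stated in this caption, 4 + 4x for fermentation and 4 + 9x for respiration. The function names are illustrative.

```python
def atp_fermentation(x):
    """ATP per glucose via fermentation: 4 ATP plus 4 NADH at x ATP each."""
    return 4 + 4 * x


def atp_respiration(x):
    """ATP per glucose via respiration: 4 ATP plus reducing equivalents worth 9x ATP."""
    return 4 + 9 * x


def yield_ratio(x):
    """ATP from respiration divided by ATP from fermentation."""
    return atp_respiration(x) / atp_fermentation(x)


if __name__ == "__main__":
    for x in (0, 0.5, 2, 1000):
        print(x, atp_fermentation(x), atp_respiration(x), yield_ratio(x))
```

For the illustrated case x = 2 this gives 12 and 22 ATP per glucose, so respiration yields a little under twice as much ATP per glucose, and the ratio approaches 9/4 only for very large x.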
Extended Data Figure 3 | Growth-rate dependence of acetate
production and CO₂ evolution in bioreactor: data and comparisons to
the model. a, The rate of CO₂ evolution was determined in a bioreactor
setup for wild-type and titratable LacY cells (NCM3722 and NQ381,
respectively) grown in lactose minimal medium with various degrees of
lactose uptake titration, and the result was used to deduce the CO₂ flux
produced by respiration (blue circles). Also plotted (red squares) is the
acetate excretion rate measured in the bioreactor. See Supplementary
Note 2 for details of this experiment and corresponding analysis. The
inducer levels, growth rate, measurements of glucose, acetate and CO₂, and
the deduced CO₂ levels via respiration are shown in the table on the right.
b, Deduced energy production fluxes from fermentation and respiration
pathways, based on the measurements presented in a. Fluxes are in units of
mM A600 nm⁻¹ h⁻¹. c, Comparison of model and experimental data. Using
the set of parameters summarized in Extended Data Table 2, the model
solution (equations (S14) and (S17)) satisfactorily describes the
experimental data obtained for acetate excretion ($J_{ac}$) and respiratory
CO₂ production ($J_{CO_2,r}$) in the bioreactor for carbon limitation. These
results depend on the assumed ratios of ATP-carbon conversion. As
described in Supplementary Note D1, the ratios we used in this work are
ATP:NADH = 2:1, ATP:NADPH = 2:1 and ATP:FADH₂ = 1.15:1. Note that
these conversion ratios have never been precisely measured and could be
substantially overestimated. However, the central results presented in
this work are robust with respect to the choice of these conversion ratios.
As an illustration, we show in d that the model results generated with a
very different set of conversion ratios (ATP:NADH = 0.5:1,
ATP:NADPH = 0.5:1 and ATP:FADH₂ = 0.5:1) even provide a slightly
better description of the data. (For these conversion ratios, the energy
production of the cell matches the theoretical energy demand for biomass
production.) The full model calibration requires the rate of CO₂ evolution,
which can only be measured in a bioreactor setup. We note a small 
discrepancy between acetate fluxes and growth rates obtained for cultures 
grown in bioreactor as compared to batch cultures, possibly caused by 
differences in aeration.
Extended Data Figure 4 | The effect of useless protein expression on
acetate excretion. a, Acetate excretion rate by strain NQ1389 is plotted
against the absolute abundance of the expressed LacZ proteins, reported
as a fraction of total protein (φZ), for each of the four carbon sources
described in Fig. 2a. The solid lines, depicting the linear decrease in
acetate excretion, are model predictions (equation (S23) in Supplementary
Information), with the lone parameter φmax ≈ 47% (the x-intercept of the
line) determined from the least-mean-squares fit of the data in the 3D plot of
Fig. 2b by a plane anchored to the acetate line. b, Alternatively, φmax can
be determined from linear fits of growth rate versus φZ for the four carbon
sources shown. This results in φmax = 42% ± 5%. The solid lines in this
panel are linear fits using φmax = 42%. (We note that over a broad growth
rate range, φmax actually exhibits a growth-rate dependence (Dai, X. et al.,
manuscript in preparation). Nevertheless, over the narrow growth-rate
range relevant for acetate excretion, this dependence is negligible.
Hence, for the purposes of our paper, we consider φmax to be constant.)
c, A different view of the 3D plot in Fig. 2b. d, Glucose uptake rate as a 
function of growth rate under LacZ overexpression. The circles are the 
data and the dashed line is the best fit to the data passing through
the origin. e, The relative protein levels of several representative genes,
taken from amino acid synthesis (red), central metabolism (blue), protein
synthesis genes (green), and nucleotide synthesis (black). As described
in ref. 28, the vast majority of genes exhibited an expression pattern
that is linearly proportional to the growth rate when growth is changed
by increasing LacZ expression. f, Growth-rate dependence of motility
proteins under carbon limitation. The proteome fraction data (green
symbols) is from the carbon limitation series in ref. 28, in which growth
rate was limited by titrating the lactose uptake for the strain NQ381. The
motility proteins are proteins that are associated with the Gene Ontology
(GO) term 0006810 (with GO name ‘locomotion’) as defined by the
Gene Ontology Consortium™. See ref. 28 for detailed description of the 
experimental procedure and data processing. Note that the fraction of 
motility proteins increases the most in the growth range where acetate is 
excreted. Also note that the energy consumption by chemotaxis comprises 
a very minor fraction of the total energy budget, estimated to be in the 
order of 0.1% (ref. 59). Disabling the motility function therefore does not 
affect the cell’s energy requirement.
Extended Data Figure 5 | Acetate excretion due to energy dissipation
by DNP. DNP is a chemical known to dissipate membrane potential
and thus imposes energy stress on the cell. Acetate excretion rates for the
titratable glucose uptake strain (NQ1243) grown in medium with glucose
and different concentrations of DNP were measured for different degrees
of glucose uptake. The results are qualitatively similar to the data for the
leaky LacY mutant NQ1313 (Fig. 3d), except for a systematic difference
in the slopes of the resulting acetate lines (thin lines of different colours). 
The origin of this deviation is presumably a more complex action of DNP 
with additional effects on the cell as compared to the leaky LacY mutant. 
Indeed, it is known for instance that in addition to leakage of protons, 
DNP also causes leakage of osmolytes through the membrane®’.
Extended Data Figure 6 | The relative expression levels of glycolysis
proteins under proteome perturbation (by LacZ overexpression), and 
energy dissipation (by expressing leaky LacY). Orange data points and 
linear fits result from the overexpression experiment (that is, NQ1389 
grown in glucose minimal medium with different induction levels of 
LacZ expression), and blue data points and fits are the leaky LacY series 
(that is, the wild-type LacY strain NQ1312 and the leaky LacY strain 
NQ1313 grown in glucose minimal medium with 200 µM 3-methylbenzyl
alcohol (3MBA)). The y axis denotes relative protein levels, which were
obtained by mass spectrometry with the same reference for the two
series (see Methods). The x axis is the growth rate (in units of h⁻¹). The
different trends of protein expression for the two series show the distinct 
nature of the two perturbations, demonstrating that these seemingly 
similar predictions for the acetate line (parallel shift to slow growth as 
shown in Fig. 3a, d) for the two perturbations, have distinct origins and 
exhibit distinctly different patterns of gene expression in accordance with 
model predictions derived in section C3 of Supplementary Note 1 (see 
equations (S26) and (S36)). From the perspective of gene regulation, it is 
not obvious what causes the increased expression levels of glycolysis genes 
under energy dissipation; the transcription factor Cra in combination
with the key central carbon intermediate fructose-1,6-bisphosphate
(FBP) is recognized as the major regulator of glycolysis. FBP relieves
the repression of glycolytic enzyme expression by Cra°?. The observed
increase in the abundance of glycolytic enzymes under energy dissipation 
could be caused by a build-up of FBP, as energy stress limits protein 
polymerization. However, in this case, it is not clear what signalling 
pathway gives rise to the opposite responses of glycolytic enzymes to LacZ 
overexpression. Inset, corresponding glucose uptake and acetate excretion 
rates for the two perturbations presented in the main figure. Glucose 
uptake and acetate excretion rates decreased in proportion to growth rate
for LacZ overexpression (as expected from the model equations (S25), 
(S26) and (S29)). On the other hand, there was a marked increase in 
acetate excretion with energy dissipation for a roughly constant glucose 
uptake rate, as correctly anticipated by the model (equation (S36)). Note 
that the protein abundance of ACS (main panel) shows that this increase 
in the acetate excretion rate was not caused by a drop in ACS. Instead, the 
observed increase of acetate excretion, together with the parallel increase 
in the expression level of glycolysis and TCA enzymes, points to the 
coordination of glycolytic and TCA fluxes in response to energy demand. 




[Figure: schematic of the TCA cycle (AcCoA, OAA, CIT, ACN, ICIT, AKG, SucCoA, SUC, FUM, MAL, GLX) with one panel per enzyme (mdh, gltA, acnA, acnB, icd, sucA, sucB, lpd, sucD, sdhA, sdhB, sdhC, sdhD, fumA, fumB, fumC, aceB); panel axes run from 0 to 1 (x) and from 0 to 1.5 (y).]

Extended Data Figure 7 | The relative expression level of TCA proteins 
under proteome perturbations (by protein overexpression) and energy 
dissipation (by using leaky LacY). Orange data points and linear fits 
represent the overexpression experiment (that is, NQ1389 with different
induction levels of LacZ expression), and blue data points and fits are
the leaky LacY series (using strains NQ1312 and NQ1313 with wild-type
and leaky LacY expression, respectively). The y axis denotes relative
protein levels, which were obtained by mass spectrometry with the same
reference for the two series (see Methods). The x axis is growth rate
(in units of h⁻¹). The different trends of protein expression for the two
series show the distinct nature of the two perturbations, demonstrating
that these seemingly similar predictions on the acetate line (parallel
shift to slow growth as shown in Fig. 3a, c) for the two perturbations,
have distinct origins and exhibit distinctly different patterns of gene
expression in accordance with model predictions derived in section
C3 of Supplementary Note 1 (see equations (S27) and (S37)). From the
perspective of gene regulation, the transcription factor CRP is thought to 
be the major regulator of TCA enzyme expression in aerobic conditions®. 
CRP-cAMP activity, which increases under carbon limitation, is known 
to activate the expression of most TCA enzymes. Together with the 
known role of CRP in upregulating the enzyme ACS which takes up 
acetate'>!®, CRP is considered to be a major candidate for regulating 
energy metabolism and acetate excretion. However, our findings that the 
expression of TCA enzymes increased under energy dissipation while 
acetate excretion also increased (Extended Data Fig. 6, inset) cannot be 
accounted for by known mechanisms of CRP regulation, and instead 
suggest an important role of additional regulators in the coordination of 
energy biogenesis pathways.
Extended Data Table 1 | Strains used in this study

Strain | Genotype | Description
NCM3722 | wild-type E. coli K12 strain | parent strain for all strains used here
NQ381 | attB::PLlac-O1-xylR, lacY::km-Pu-lacY | titratable LacY
NQ636 | glpK g184t | GlpK mutant
NQ638 | glpK a218t | GlpK mutant
NQ640 | glpK g692a | GlpK mutant
NQ1243 | ycaD::FRT:Ptet:xylR PptsG::kan:Pu:ptsG | titratable PtsG
NQ1312 | ycaD::FRT:Ptet:xylR PptsG::kan:Pu:ptsG; pZA31 Ptet-lacY | WT LacY control for NQ1313
NQ1313 | ycaD::FRT:Ptet:xylR PptsG::kan:Pu:ptsG; pZA31 Ptet-lacY A177V | leaky LacY mutant
NQ1388 | ycaD::FRT:Ptet:xylR Pu:ptsG; ΔflhD-kan | flhD deletion strain
NQ1389 | Ptet-tetR on pZA31; Ptetstab-lacZ on pZE1 | LacZ over-expression strain
NQ1539 | ycaD::FRT:Ptet:xylR Pu:ptsG; ΔfliA-kan | fliA deletion strain
EQ353 | wild-type E. coli MG1655 used in Li et al. | obtained from Jonathan Weissman lab


Except for EQ353, all the strains used are derived from E. coli K-12 strain NCM3722 (refs 50-52) provided by the S. Kustu laboratory. Descriptions of the key strains used in this study are as follows.

NQ1243: varying glucose uptake by titrating the expression of PtsG, a subunit of the glucose PTS permease. The glucose PTS permease consists of two subunits, PtsG and Crr. Strain NQ1243 was constructed by replacing the ptsG promoter with a titratable Pu promoter from Pseudomonas putida. The activity of the Pu promoter is activated by the regulator XylR upon induction by 3-methylbenzyl alcohol. Strain NQ1243 was grown in glucose minimal medium, supplemented with various 3-methylbenzyl alcohol levels (0-800 µM) to stimulate XylR and titrate the expression of PtsG.

NQ381: varying lactose uptake by titrating the expression of LacY. LacY (or lactose permease) is the primary transporter that allows E. coli to grow on lactose as the sole carbon source. Strain NQ381 was constructed by inserting the same titratable Pu promoter (above) between the lacZ stop codon and the lacY start codon. See ref. 27 for details of strain construction.

NQ1389: the titratable LacZ overexpression system. This strain carries two plasmids, pZA31 and pZE1. The repressor TetR gene on the pZA31 plasmid is driven by the TetR-repressible PLtet-O1 promoter®S, while the lacZ gene on the pZE1 plasmid is driven by the modified tet-promoter (more stable with respect to spontaneous mutations). The combination of these two plasmids creates a stable, finely titratable system that can be induced via the addition of chlorotetracycline in the medium*. This induction system is tight, highly linear and capable of very high LacZ expression levels (with LacZ constituting up to 42% of the proteome) as seen in Fig. 2 and Extended Data Fig. 4. See ref. 28 for details of strain construction.

NQ1312 and NQ1313: strains containing plasmids expressing LacY and LacY A177V. The leaky LacY mutant (LacY A177V; ref. 33) or the control wild-type LacY is each driven by the PLtet-O1 promoter, harboured on the pZA31 plasmid. Neither strain contains a source of the TetR repressor, hence the plasmid expression system is fully induced. Bacteria use the H+ gradient across their inner membrane, generated by the electron transport chain, to produce ATP using the ATP synthase complex. The leaky LacY protein allows protons to pass through the inner membrane of the cell, thereby ‘draining’ the membrane potential generated by energy production pathways. This in turn leads to reduced energy efficiency or an increased energy demand on the bacterium.

Extended Data Table 2 | Model parameters calibrated from bioreactor


Parameter (units) | Description | Value (from literature or measured)
φE,max = 1 − φ0 (%) | maximum energy proteome fraction extrapolated to λ = 0 |
b (% hr) | energy sector growth rate dependence | 12.0 ± 1.4 (b)
σ (mM/OD) | energy demand, proportionality constant with growth rate | 45.7 ± 2.8 (c)
ef (1) | carbon efficiency, fermentation |
er (1) | carbon efficiency, respiration |
εf (mM/OD/hr) | protein efficiency, fermentation | 750 ± 30 (f)
εr (mM/OD/hr) | protein efficiency, respiration | 390 ± 10 (g)
β (mM/OD) | carbon demand for biomass building blocks, proportionality constant with growth rate |
φmax (1) | maximum total proteome fraction | 0.42 ± 0.05 (i)
Sac (1) | stoichiometric factor for acetate from fermentation |
SCO2 (1) | stoichiometric factor for CO2 from respiration |

a Determined from mass spectrometry data. Offset of linear function fitted to the total energy sector size in Supplementary Note 1, Fig. N6. Error inferred from this linear fit. See Supplementary Note 1, section D4, for details.

b Determined from mass spectrometry data. Slope of linear function fitted to the total energy sector size in Supplementary Note 1, Fig. N6. Error inferred from this linear fit. See Supplementary Note 1, section D4, for details.

c Determined from energy flux data, assuming energy demand proportional to growth rate. See Supplementary Note 1, section D3, and Fig. N5.

d Determined from the literature. Number of ATP produced per carbon processed in fermentation. A total of 4 ATP and 4 NADH molecules is produced per glucose molecule metabolized in fermentation (EcoCyc®5). On the basis of ref. 56, a conversion ratio of NADH, NADPH to ATP of 2.0 was assumed. Hence, the equivalent of 2.0 ATP molecules is produced per carbon metabolized in fermentation.

e Determined from the literature. Number of ATP produced per carbon processed in respiration. A total of 4 ATP, 2 FADH2, 2 NADPH and 8 NADH molecules is produced per glucose molecule metabolized in respiration (EcoCyc®®). On the basis of ref. 56, a conversion ratio of NADH, NADPH to ATP of 2.0 and a conversion ratio of FADH2 to ATP of 1.15 were assumed. Hence, the equivalent of 4.4 ATP is produced per carbon metabolized in respiration.

f Determined from mass spectrometry data. Energy flux produced per protein fraction invested in the fermentation pathway. See Supplementary Note 1, section D5, for details.

g Determined from mass spectrometry data. Energy flux produced per protein fraction invested in the respiration pathway. See Supplementary Note 1, section D5, for details.

h Determined from carbon uptake flux and carbon fluxes by the energy pathways. See Supplementary Note 1, section D2, and Fig. N4 for details.

i Determined by LacZ overexpression. Given by the proteome fraction occupied by LacZ at which the growth rate vanishes. According to the average and the corresponding standard deviation of the fits of the data presented in Fig. 2, growth rate vanishes at 260,000 ± 30,000 MU, which translates into a proteome fraction of 42% ± 5%, given that 10,000 MU corresponds to 1.6% of proteome. Note that this estimate is in good agreement with the estimates in previous works (refs 21, 27, 28).

j Here, Sac = 1/3 simply because of the chemical reaction 6C → 2 acetate + 2CO2 of the fermentation pathway: the carbon uptake flux JC, measured in units of the number of carbon atoms (C), is three times the flux of acetate molecules.

‘Scon= 1/6 simply because of the chemical reaction 6C — 12COz of the respiration pathway, oxidizing all carbon atoms to CO2. 
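
The arithmetic behind footnotes d, e and i can be checked with a short sketch. The cofactor counts and conversion ratios are the ones quoted in the footnotes; the function and variable names are illustrative assumptions rather than part of the original analysis, and the Miller-unit conversion assumes that 10,000 MU corresponds to 1.6% of the proteome.

```python
# ATP equivalents per cofactor, as assumed in footnotes d and e.
ATP_PER_COFACTOR = {"ATP": 1.0, "NADH": 2.0, "NADPH": 2.0, "FADH2": 1.15}
CARBONS_PER_GLUCOSE = 6


def atp_per_carbon(yields, ratios=ATP_PER_COFACTOR):
    """ATP equivalents produced per carbon atom of glucose metabolized."""
    total = sum(count * ratios[cofactor] for cofactor, count in yields.items())
    return total / CARBONS_PER_GLUCOSE


def mu_to_percent(miller_units):
    """Convert LacZ activity (Miller units) to proteome fraction in per cent.

    Assumes 10,000 MU corresponds to 1.6% of the proteome.
    """
    return miller_units * 1.6 / 10_000


# Cofactors produced per glucose molecule in each pathway (footnotes d, e).
fermentation = {"ATP": 4, "NADH": 4}
respiration = {"ATP": 4, "FADH2": 2, "NADPH": 2, "NADH": 8}

e_f = atp_per_carbon(fermentation)   # (4 + 8) / 6
e_r = atp_per_carbon(respiration)    # (4 + 2.3 + 4 + 16) / 6

phi_max = mu_to_percent(260_000)     # proteome fraction where growth vanishes
phi_max_err = mu_to_percent(30_000)  # corresponding uncertainty

print(f"e_f = {e_f:.1f}, e_r = {e_r:.1f} ATP per carbon")
print(f"phi_max = {phi_max:.0f}% +/- {phi_max_err:.0f}%")
```

With the ratios above, the respiration yield comes to 26.3 ATP equivalents per glucose, which rounds to the 4.4 ATP per carbon quoted in footnote e.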


Extended Data Table 3 | Comparison between phenomenological model predictions and empirical results


Parameter Description 


slope of acetate line with constant LacZ expression level of 50 000 MU (or φZ ≈ 8%)
threshold growth rate of acetate line with constant LacZ expression level of 50 000 MU (or φZ ≈ 8%)

slope of acetate line with constant LacZ expression level of 100 000 MU (or φZ ≈ 16%)
threshold growth rate of acetate line with constant LacZ expression level of 100 000 MU (or φZ ≈ 16%)


threshold growth rate of acetate line with flagella Extended 
knockout Data Fig. 


i i 
chloramphenicol E0.02 


threshold growth rate of acetate line with 4mM i 
; +0.02 
chloramphenicol 


a Equation (S26) in Supplementary Information predicts direct proportionalities between growth rate λ and acetate excretion rates Jac′ for LacZ overexpression. Hence, the lines should intercept the origin, with a vanishing acetate excretion rate Jac′(λ = 0).

b Intercept Jac′(λ = 0) of the least-mean-squares fit of a line to the experimental data for different levels of LacZ overexpression presented in Fig. 2a.

c Equation (S30) predicts a slope identical to the standard acetate line for a constant level of LacZ overexpression. The model prediction is illustrated in Supplementary Note 1, Fig. N2a, and presented as the thin red lines in Fig. 3a.

d For the four tested carbon sources G6P, glucose, mannitol and lactose, acetate excretion rates and β-galactosidase activities were fitted as linear functions of growth rate. These fits were then used to interpolate growth rates and acetate excretion rates for a fixed level of LacZ overexpression. This resulted in the four points for a fixed LacZ level from each of the different carbon sources presented in Fig. 3a. Resulting slopes and intercepts presented in this table are the result of least-mean-squares fits of lines to these points.

e Equation (S30) in Supplementary Information predicts the threshold growth rate for a fixed amount of protein overexpression, using the parameter φmax empirically determined in this work and previous works (refs 21, 27, 28) as input. The model prediction is illustrated in Supplementary Note 1, Fig. N2a, and presented as the thin red lines in Fig. 3a.

f Using equations (S14) and (S28) of Supplementary Information, assuming the proteome sector of motility proteins decreases linearly with growth rate, vanishing at λ = 1.1 h⁻¹ and constituting 10% of the proteome at λ = λac (compare to Extended Data Fig. 4f).

g Estimated from the data presented in Extended Data Fig. 4a.

h For chloramphenicol stress, equation (S14) in Supplementary Information predicts an increased slope Sac′ with an identical offset Sac′λac′, as compared to the standard acetate line. The model predictions (thin red lines, Fig. 3c) arise from using the offset of the standard acetate line given by Sacλac as input and the slope of the modified acetate line as a fitting parameter.

i Threshold growth rate determined from a least-mean-squares fit of a line to the data points presented in Fig. 3c.

j Equation (S32) in Supplementary Information predicts an identical slope of the modified acetate line with energy dissipation as compared to the standard acetate line. The model prediction is illustrated in Supplementary Note 1, Fig. N3, and presented as the thin red line in Fig. 3d.

k Slope is the result of a least-mean-squares fit of a line to the data points presented in Fig. 3d.

Warm–hot baryons comprise 5–10 per cent of
filaments in the cosmic web


Dominique Eckert, Mathilde Jauzac, HuanYuan Shan, Jean-Paul Kneib, Thomas Erben, Holger Israel, Eric Jullo,
Matthias Klein, Richard Massey, Johan Richard & Céline Tchernin


Observations of the cosmic microwave background indicate 
that baryons account for 5 per cent of the Universe’s total energy 
content¹. In the local Universe, the census of all observed baryons
falls short of this estimate by a factor of two”?. Cosmological 
simulations indicate that the missing baryons have not condensed 
into virialized haloes, but reside throughout the filaments of the 
cosmic web (where matter density is larger than average) as a low-density
plasma at temperatures of 10⁵–10⁷ kelvin, known as the
warm-hot intergalactic medium?°. There have been previous 
claims of the detection of warm-hot baryons along the line of sight 
to distant blazars’-!° and of hot gas between interacting clusters!"4, 
These observations were, however, unable to trace the large-scale 
filamentary structure, or to estimate the total amount of warm- 
hot baryons in a representative volume of the Universe. Here we 
report X-ray observations of filamentary structures of gas at 
10⁷ kelvin associated with the galaxy cluster Abell 2744. Previous
observations of this cluster! were unable to resolve and remove 
coincidental X-ray point sources. After subtracting these, we find 
hot gas structures that are coherent over scales of 8 megaparsecs. 
The filaments coincide with over-densities of galaxies and dark 
matter, with 5-10 per cent of their mass in baryonic gas. This gas 
has been heated up by the cluster’s gravitational pull and is now 
feeding its core. Our findings strengthen evidence for a picture of 
the Universe in which a large fraction of the missing baryons reside 
in the filaments of the cosmic web. 

Abell 2744 is a massive galaxy cluster (containing a total mass 
of ~1.8 × 10¹⁵ solar masses inside a radius of 1.3 Mpc; ref. 16) at a
redshift of 0.306 (refs 17, 18). In its central regions, the cluster exhibits 
a complex distribution of dark and luminous matter, as inferred from 
X-ray and gravitational lensing analyses!®'*!°. Spectroscopic observa- 
tions indicate large variations in the line-of-sight velocity of different 
regions'”!8, Together, these observations reveal that the cluster is cur- 
rently experiencing a merger of at least four individual components, 
supporting the hypothesis that Abell 2744 may be an active node of 
the cosmic web. 

In December 2014, we obtained a 110 ks observation of the cluster
by the XMM-Newton X-ray observatory, covering the core and its surroundings
out to a radius of ~4 h70⁻¹ Mpc, where h70 = H0/(70 km s⁻¹
Mpc⁻¹). We extracted a surface-brightness image of the observation,
subtracting a model for the instrumental background and accounting 
for variation of the telescope efficiency across the field of view. Figure 1 
shows the resulting surface-brightness image in the 0.5-1.2 keV 
band obtained by combining the data from the three detectors of 
the European Photon Imaging Camera (EPIC) on board XMM- 
Newton. X-ray point sources were masked and the data were adap- 
tively smoothed to highlight the diffuse emission. The high sensitivity 
achieved during this observation, thanks to a minimal number of solar 


flares, allowed us to identify several previously unreported features. 
Near the virial radius of the cluster (~2 h70⁻¹ Mpc) and beyond, several
high-significance (>6σ) regions of diffuse emission are detected and
appear to be connected to the cluster core. To confirm this connection, 
we extracted the X-ray emissivity profile of the cluster by masking the 
regions of excess emission, and compared the resulting profile with the 
emissivity profile in the sectors encompassing the filamentary struc- 
tures (see Extended Data Fig. 1). Although the emissivity of the cluster 
falls below the detectable level at ~2 h70⁻¹ Mpc from the cluster centre,
we observe significant emission in sectors extending continuously to
the edge of the XMM-Newton field of view, that is, roughly at 4 h70⁻¹
Mpc in projection from the core. This shows that the detected features
are very extended and not caused either by the superposition of
unresolved point sources or by individual group-scale haloes. These 
structures are not visible at higher energies (2-7 keV), in contrast to 
the cluster core. This suggests that the gas observed in the structures is 
cooler than that of the central regions. 

To identify the structures detected in X-rays, we used a collec- 
tion of published spectroscopic redshifts within the XMM-Newton 
field of view. Spectroscopic redshifts are available for 1,500 galaxies 
in the field'”!8. We selected galaxies with velocities falling within 
±5,000 km s⁻¹ of the cluster mean to capture the cluster and its
Figure 1 | Map of the hot gas in and around the galaxy cluster Abell
2744. Shown is the XMM-Newton/EPIC surface-brightness image of the
galaxy cluster Abell 2744 in the 0.5-1.2 keV band. The colour bar indicates
the brightness in units of erg cm⁻² s⁻¹ arcmin⁻². The green circle shows
the approximate location of the virial radius Rvir ≈ 2.1 h70⁻¹ Mpc. The white
ellipses highlight the position of diffuse structures discovered here.
Figure 2 | Comparison between the distribution of hot gas and galaxies
in the region surrounding Abell 2744. Shown is the XMM-Newton image
of Abell 2744 (same data as Fig. 1); also shown are the positions of member
galaxies with spectroscopic redshift within ±5,000 km s⁻¹ of the cluster
mean! (red dots); red curves show galaxy number density contours. 


accretion region in their entirety. In Fig. 2 we show the XMM-Newton 
brightness image together with the position of selected cluster mem- 
bers and galaxy density contours. Concentrations of cluster galaxies are 
found coincident with the four hot-gas filamentary structures labelled 
E, S, SW and NW in Fig. 1. Conversely, structure N corresponds to a
background galaxy concentration at redshift z ≈ 0.45, whereas the galaxies
associated with the SE substructure exhibit a substantial velocity
difference of −8,000 km s⁻¹ compared to the cluster core. This velocity
difference corresponds to a large projected distance from the cluster, 
which indicates that, although it is part of the same superstructure, this 
system is probably not interacting with the main cluster. We therefore 
consider the association of the SE structure with the Abell 2744 com- 
plex as tentative and ignore it for the remainder of the analysis. As a 
result, we only associate structures E, S, SW and NW with the accretion 
flow towards Abell 2744. Structures S+SW and NW have already been 
identified as galaxy filaments on the basis of the galaxy distribution!>””. 
The average redshift of the galaxies in the E, S and NW structures is 
consistent with that of the main cluster (see Table 1), indicating that 
these filamentary structures are oriented close to the plane of the sky. 
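
The velocity cut used to select cluster members can be sketched as follows. The text does not state which formula was applied, so this sketch assumes the standard rest-frame line-of-sight offset Δv = c(z − z_c)/(1 + z_c); the function names and the list of galaxy redshifts are hypothetical and serve only to illustrate the ±5,000 km s⁻¹ window around the cluster redshift of 0.306.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def velocity_offset(z, z_cluster):
    """Rest-frame line-of-sight velocity offset (km/s) relative to the cluster."""
    return C_KM_S * (z - z_cluster) / (1.0 + z_cluster)


def select_members(redshifts, z_cluster=0.306, dv_max=5000.0):
    """Keep galaxies whose velocity offset lies within +/- dv_max km/s."""
    return [z for z in redshifts
            if abs(velocity_offset(z, z_cluster)) <= dv_max]


# Hypothetical redshifts: three near the cluster, one offset by roughly
# -8,000 km/s, and one background object at z = 0.45.
sample = [0.303, 0.305, 0.308, 0.271, 0.45]
members = select_members(sample)

for z in sample:
    print(f"z = {z:.3f}  dv = {velocity_offset(z, 0.306):+9.0f} km/s")
print("selected:", members)
```

Under this assumption, a galaxy at z = 0.271 sits about 8,000 km s⁻¹ below the cluster mean and falls outside the selection window, as does the background object.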
To map the distribution of total mass around the cluster, we meas- 
ured the weak and strong gravitational lensing of background galaxies 
visible in wide-field optical images from ground-based telescopes and 
in ultra-deep Hubble Space Telescope (HST) imaging of the cluster 
core”". Our identification of cluster member galaxies utilizes a photo- 
metric galaxy catalogue based on Canada-France-Hawaii Telescope 
(CFHT) data in the i’ optical wavelength band and deep, archival data 


Table 1 | Properties of the filaments discovered in this study

Figure 3 | Hot gas, visible light and total mass in Abell 2744. Shown is
the CFHT image of Abell 2744 and the surrounding large-scale structure.
The contours show X-ray isophotes (blue), mass distribution reconstructed
from combined strong and weak lensing (white), and optical light
(dashed red).


from the Wide-Field Imager (WFI) on the ESO 2.2-m telescope in the 
B, V and R bands. We selected cluster members and background gal- 
axies using their colours in the BVRi wavelength bands”, and used the 
shear signal measured from a combination of HST and CFHT images 
for the weak lensing analysis. We used both a simple inversion method 
and a combined parametric and non-parametric optimization to recon- 
struct the weak lensing signal. We found that all the substructures iden- 
tified by XMM-Newton coincide with peaks in the matter distribution, 
as shown in Fig. 3. We then used the weak lensing information to infer 
an estimate of the mass of the structures detected in X-rays. The total 
mass within the identified substructures is given in Table 1. Given that 
dark matter dominates the total mass budget, we conclude that the 
structures reported here correspond to overdensities in both the baryon 
and dark-matter distribution. 

Wide-field galaxy redshift surveys have shown that the large-scale 
distribution of matter in the Universe is not homogeneous”?”4, Instead, 
matter tends to fall together under the action of gravity into filamentary 
structures, forming the cosmic web**”°. Galaxy clusters, the largest 
gravitationally-bound structures in the Universe, form at its nodes, 
where the matter density is the highest. We therefore associate the 
structures discovered here with intergalactic filaments and conclude 
that Abell 2744 is an active node of the cosmic web. 

We estimated the plasma temperature in all the filaments highlighted 
in Fig. 1 by extracting their X-ray spectra and fitting them with a thin- 
plasma emission model. The gas in the structures has a typical density 
of a few times 10⁻⁴ particles per cm³, corresponding to overdensities
of ~200 compared to the mean baryon density”®. Approximating 


Region | <z> | T (10⁶ K) | Mgas (h⁻¹ M☉) | SNR X-ray | Mtot (h⁻¹ M☉) | SNR lensing | fgas
E | 0.308 | 15±2 | (3.8±0.6) × 10¹² | 15.4 | (7.9±2.8) × 10¹³ | 3.1 | 0.05±0.02
S | 0.303 | 16±2 | (7.1±0.8) × 10¹² | 22.6 | (9.5±2.4) × 10¹³ | 6.8 | 0.07±0.02
SW | 0.305 | 3th | (2.0±0.4) × 10¹² | 9.6 | (4.8±1.7) × 10¹³ | 3.1 | 0.04±0.02
NW1 | 0.305 | 25±4 | (5.7±0.3) × 10¹² | 25.3 | (9.5±2.7) × 10¹³ | 5.2 | 0.06±0.02
NW2 | 0.305 | 19±2 | (1.9±0.1) × 10¹³ | 25.9 | (1.2±0.3) × 10¹⁴ | 3.3 | 0.15±0.04

X-ray and lensing properties of the regions defined in Extended Data Fig. 2. Note that because of the uncertainty in the geometry of the filaments, the provided gas mass (Mgas), total mass (Mtot) and gas fraction (fgas) should be considered as indicative. The masses reported here were obtained by combining strong and weak lensing. A comparison with weak-lensing-only measurements is provided in Extended Data Table 2. SNR, signal to noise ratio. M☉, solar mass.

the geometry of the filaments as segments of cylinders, we estimate
the total gas mass enclosed within the filaments to be considerable 
(~4 × 10¹³ solar masses). Given the mass within the filaments obtained
from weak lensing, we estimate a gas fraction between 5% and 15%
for the various substructures, depending on the adopted mass reconstruction
method (see Table 1), which represents a large fraction of the
Universe’s baryon fraction of 15% (ref. 1). The plasma temperature is
in the range (10-20) × 10⁶ K for the various filaments (see Table 1).
This is substantially less than the virial temperature of the cluster
core (~10⁸ K), which indicates that the plasma has not yet virialized
within the main dark-matter halo. These gas temperatures and den- 
sities correspond to those expected for the hottest and densest parts 
of the warm-hot intergalactic medium (WHIM)**?7?8, Numerical 
simulations predict that the bulk of the gas permeating intergalactic 
filaments should have temperatures in the range 10^5.5-10^6.5 K, but
the gas in the vicinity of the cluster may have undergone substantial 
heating caused by adiabatic compression and shock heating. Note also 
that the temperatures measured here may be overestimated, given that 
X-ray telescopes are sensitive preferentially to the hottest phase of the 
expected gas distribution. Overall, these properties support the picture 
in which a large fraction of the Universe's baryons are located in the 
filaments of the cosmic web. 
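
The gas fractions and the total gas mass quoted above can be cross-checked with a few lines of arithmetic. This is a sketch only: it uses the central values of Table 1, with the powers of ten taken so that Mgas/Mtot reproduces the listed gas fractions, ignores the quoted uncertainties, and uses variable names that are assumptions made for illustration.

```python
# Central values of (gas mass, total mass) in solar masses for each filament.
# The powers of ten are chosen so that the ratios match the tabulated f_gas.
filaments = {
    "E":   (3.8e12, 7.9e13),
    "S":   (7.1e12, 9.5e13),
    "SW":  (2.0e12, 4.8e13),
    "NW1": (5.7e12, 9.5e13),
    "NW2": (1.9e13, 1.2e14),
}

COSMIC_BARYON_FRACTION = 0.15  # Universe's baryon fraction quoted in the text

# Gas fraction of each filament and the summed gas mass.
gas_fraction = {name: m_gas / m_tot
                for name, (m_gas, m_tot) in filaments.items()}
total_gas_mass = sum(m_gas for m_gas, _ in filaments.values())

for name, f_gas in gas_fraction.items():
    share = f_gas / COSMIC_BARYON_FRACTION
    print(f"{name:>3}: f_gas = {f_gas:.2f} ({share:.0%} of the cosmic value)")
print(f"total gas mass = {total_gas_mass:.1e} solar masses")
```

The summed gas mass comes out close to the ~4 × 10¹³ solar masses stated in the text, and the individual ratios fall between roughly 4% and 16%.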


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 

Imaging X-ray analysis. Abell 2744 was observed by XMM-Newton in late 2014 
for a total observing time of 110 ks (PI, J.-P.K.; OBSID, 074385). At the redshift 
of Abell 2744 (0.306), the size of the XMM-Newton field of view corresponds to 
8 h_70^-1 Mpc. We processed the data using the XMM-Newton Scientific Analysis 
System (XMMSAS) v14.0. We excluded flaring periods from the event files by 
creating a light curve for each instrument separately and filtering out the time 
periods for which the observed count rate exceeded the mean by more than 2σ. 
The observation was very mildly affected by soft-proton flares, allowing us to 
reach a flare-free observing time of 96 ks, 97 ks and 87 ks for EPIC detectors MOS1, 
MOS2 and pn, respectively. 
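
The flare-filtering criterion described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the XMMSAS procedure itself; the function name `good_time_mask` and the synthetic light curve are hypothetical.

```python
import numpy as np

def good_time_mask(rate, n_sigma=2.0):
    """Return a boolean mask of the time bins kept after flare filtering.

    A bin is rejected when its count rate exceeds the mean of the light
    curve by more than n_sigma standard deviations.
    """
    rate = np.asarray(rate, dtype=float)
    threshold = rate.mean() + n_sigma * rate.std()
    return rate <= threshold

# Synthetic light curve (hypothetical values): quiescent rate plus one flare.
rng = np.random.default_rng(0)
rate = rng.normal(loc=1.0, scale=0.05, size=200)
rate[50:55] += 3.0  # injected flare
mask = good_time_mask(rate)
clean_fraction = mask.mean()  # fraction of the exposure that survives
```

In practice the clipping would be applied to each instrument's light curve separately, as stated in the text.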

We extracted raw images in the 0.5-1.2 keV band for all three EPIC detectors 
using the Extended Source Analysis Software (ESAS) package. This energy band 
maximizes the source-to-background ratio and avoids the bright Al and Si background 
emission lines, while maintaining a large effective area since the collecting 
power of the XMM-Newton telescopes peaks at 1 keV. Exposure maps for each 
instrument were created, taking into account the variations of the vignetting across 
the field of view. A model image of the non-X-ray background (NXB) was computed 
using a collection of closed-filter observations, and was adjusted to each individual 
observation by comparing the count rates in the corner of the field of view. 
X-ray point sources were detected using the XMMSAS tool ewavelet and masked 
during the analysis. Additionally, we used the existing Chandra observations of 
the cluster to detect point sources down to fainter X-ray fluxes (~5 × 10^-16 
erg cm^-2 s^-1) and mask the corresponding areas. Such a flux threshold for point- 
source removal corresponds to a resolved fraction of 80% of the cosmic X-ray 
background, which is associated with a cosmic variance of about 5%. This ensures 
that the extended features reported here are indeed caused by diffuse emission. 

We computed surface-brightness images by subtracting the NXB from the raw 
images and dividing them by the exposure maps. To maximize the signal-to-noise 
ratio (SNR), we then combined the surface-brightness images of the three EPIC 
detectors by weighting each detector by its relative effective area. The resulting 
image was then adaptively smoothed using the XMMSAS tool asmooth, requiring 
an SNR of 5 for all features above the local background. The total XMM-Newton/ 
EPIC image of Abell 2744 is shown in Fig. 1. 
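
The image arithmetic in this paragraph can be illustrated with a short sketch. It assumes that "weighting each detector by its relative effective area" means a weighted mean of the per-detector images; the function names, the toy arrays and the weights are hypothetical, and the adaptive smoothing step is not reproduced.

```python
import numpy as np

def surface_brightness(raw, nxb, exposure):
    """Compute (raw - NXB) / exposure, with unexposed pixels set to NaN."""
    raw, nxb, exposure = (np.asarray(a, dtype=float) for a in (raw, nxb, exposure))
    sb = np.full(raw.shape, np.nan)
    ok = exposure > 0
    sb[ok] = (raw[ok] - nxb[ok]) / exposure[ok]
    return sb

def combine_detectors(images, weights):
    """Weighted mean of per-detector images (weights: relative effective areas)."""
    images = np.asarray(images, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, images, axes=1)

# Toy 2x2 example with hypothetical counts, background and exposure.
raw = np.full((2, 2), 10.0)
nxb = np.full((2, 2), 2.0)
exposure = np.full((2, 2), 4.0)
sb = surface_brightness(raw, nxb, exposure)
combined = combine_detectors([sb, 2 * sb, 3 * sb], weights=[1.0, 1.0, 2.0])
```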

To confirm the presence of the filamentary structures shown in Fig. 1, we compared 
the surface brightness of the regions inside and outside the filaments. We 
used the PROFFIT code to extract the surface-brightness profile outward from the 
surface-brightness peak while masking the sectors corresponding to the filaments, and 
we compared the masked profile with the surface-brightness profile in the direction 
of the filaments, that is, in the sectors including the filaments (position angles 
10°-70°, 150°-180° and 260°-300° for the NW, E and S filaments, respectively, 
where 0° is the W direction; see Extended Data Fig. 2). In Extended Data Fig. 1 we 
show the corresponding surface-brightness profiles. When masking the filaments, 
no statistically significant cluster emission is detected beyond 7 arcmin (~2 h_70^-1 
Mpc); in the direction of the filaments, a flat surface brightness is observed out to 
the edge of the field of view (~4 h_70^-1 Mpc). The small variations in the amplitude 
of the surface-brightness profiles indicate that the emission is due to filamentary 
structures rather than to a collection of infalling clumps. The excess emission 
produced by the filaments has already been noted in Suzaku observations of the 
cluster; however, the poor angular resolution and narrow field of view of Suzaku were 
insufficient to separate the filaments from the field and resolve point sources. 
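
The comparison of profiles inside and outside the filament sectors can be sketched as below. This is not the PROFFIT implementation; `sector_profile` is a hypothetical helper, and the angle convention (degrees counter-clockwise from the +x pixel axis) is an assumption that would need to be mapped onto the sky orientation used in the text.

```python
import numpy as np

def sector_profile(image, center, pa_min, pa_max, r_edges):
    """Mean of `image` in radial bins, restricted to an angular sector.

    center is (x0, y0) in pixels; angles are in degrees, measured
    counter-clockwise from the +x axis (assumed convention); r_edges
    are the bin edges in pixels.
    """
    y, x = np.indices(image.shape)
    dx, dy = x - center[0], y - center[1]
    r = np.hypot(dx, dy)
    pa = np.degrees(np.arctan2(dy, dx)) % 360.0
    in_sector = (pa >= pa_min) & (pa <= pa_max)
    profile = np.full(len(r_edges) - 1, np.nan)
    for i, (r0, r1) in enumerate(zip(r_edges[:-1], r_edges[1:])):
        sel = in_sector & (r >= r0) & (r < r1) & np.isfinite(image)
        if sel.any():
            profile[i] = image[sel].mean()
    return profile

# Toy image: the right half is twice as bright, standing in for a filament.
yy, xx = np.indices((101, 101))
image = 1.0 + (xx > 50)
r_edges = np.array([5.0, 15.0, 25.0, 35.0])
toward = sector_profile(image, (50, 50), 10.0, 70.0, r_edges)
away = sector_profile(image, (50, 50), 100.0, 170.0, r_edges)
```

A sector pointing at the bright half returns a flat, elevated profile, while a sector pointing away returns the baseline level.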

For comparison, we extracted radial profiles of galaxy density from spectroscopically 
confirmed members in exactly the same sectors. The resulting profiles are 
shown in Extended Data Fig. 3. We find that beyond the cluster’s virial radius the 
galaxy density is consistently larger in the regions containing the filaments com- 
pared to the perpendicular directions, which highlights the association between 
the structures detected in X-rays and the local galaxy distribution. 

Spectral X-ray analysis. We performed a spectral analysis of the structures high- 
lighted in Fig. 1. We defined elliptical regions following the X-ray isophotes as 
closely as possible. In Extended Data Fig. 2 we show the regions used to derive the 
spectral properties of the filaments. Since the surface brightness of these regions 
barely exceeds the background level, a detailed modelling of all the various back- 
ground components is necessary to obtain reliable measurements of the relevant 
parameters. We adopted the following approach to model the various spectral 
components. 

The source. We modelled the diffuse emission in each region using the thin-plasma 
emission code APEC, leaving the temperature and normalization as free 
parameters. The metal abundance Z was fixed to 0.2 Z⊙ (ref. 34). This component 
is absorbed by the Galactic column density, which we fixed to the 21-cm value 
(N_H = 1.5 × 10^20 cm^-2; ref. 35). 

The non-X-ray background (NXB). We used closed-filter observations to estimate 
the spectrum of the NXB component in each region. Instead of subtracting the 
NXB, we modelled it using a phenomenological model and included it as an additive 
component in the spectral fitting. This method has the advantage of retaining 
the statistical properties of the original spectrum. We left the normalization of 
the NXB component free to vary during the fitting procedure, which allows us to 
take variations of the NXB level into account. The normalization of the prominent 
background lines was also left free. Since the observation was very weakly contaminated 
by soft proton flares, the residual soft proton component can be neglected. 

The sky background components. We used four offset regions where no cluster emission 
is detected (see Extended Data Fig. 2) to measure the sky background compo- 
nents in the field of Abell 2744. We modelled the sky background using a three- 
component model: (i) a power law with photon index fixed to 1.46 to model the 
cosmic X-ray background (CXB); (ii) a thermal component at a free temperature 
to estimate the Galactic halo emission; and (iii) an unabsorbed thermal component 
at 0.11 keV for the local hot bubble. The best-fit spectrum for the Offset 1 region 
is shown in the top-left panel of Extended Data Fig. 4. In Extended Data Table 1 
we show the best-fit parameters for our sky background model in the four offset 
regions. The variation of the parameters from one region to another allows us 
to estimate the systematic uncertainties associated with the variation of the sky 
background across the field of view. The main sky component (the CXB) typically 
varies by ±10% across the field. Slightly larger variations (~20%) are observed for 
the foreground components, although it must be noted that the normalizations of 
the Galactic halo and local bubble components are correlated. The overall values 
of these parameters agree well with previous measurements of the CXB and the 
foregrounds. 

We note that because of strong Galactic absorption in the far-ultraviolet band 
and falling effective area in this wavelength range, XMM-Newton is sensitive pre- 
dominantly to the hottest phase of the gas (T > 10°° K). To test the sensitivity of 
our observations to cooler plasma, we assumed a differential emission-measure 
model including gas temperatures in the range 10°°-10’ K and simulated an XMM- 
Newton spectrum at the same depth as our observation. The resulting spectrum 
can be well fitted with a single-temperature model at T= 10°* K. This indicates 
that the temperatures measured here may be substantially overestimated if the 
plasma is multiphase. 

In Extended Data Fig. 4 we show the observed spectra for the five regions 
defined in Extended Data Fig. 2 together with their best-fit model. Since it is the 
brightest and most extended, the NW filament was split into two regions (labelled 
NW1 and NW2) to study the variation of the spectral parameters along a single 
filament. The resulting parameters are provided in Table 1. To estimate the gas mass 
within each filamentary structure, we modelled the emission region as a cylinder 
with length and diameter given by the major and minor axes of the defined ellipses, 
respectively. We converted the measured normalization into an emission measure, 
and computed the average gas density assuming constant density in each structure. 
We estimated the gas mass by integrating the resulting gas density over the vol- 
ume (see Table 1). We note that given the large uncertainties in the 3D geometry 
of the filaments, the recovered gas densities and masses should be considered as 
indicative. Indeed, we tested the effect of adopting different geometries (spheres, 
ellipsoids) on the recovered gas mass and gas density, and found that the results 
obtained with the various geometries vary by ~30%. 
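
The conversion from spectral normalization to gas mass can be sketched as follows. The sketch assumes the standard XSPEC definition of the APEC normalization, norm = 10^-14 / (4π [D_A (1+z)]^2) ∫ n_e n_H dV in cgs units, a uniform cylinder, and approximate plasma ratios n_e = 1.2 n_H and ρ = 1.4 m_p n_H; these ratios, the function name and all input values are illustrative assumptions, not the values used in the paper.

```python
import math

MPC_CM = 3.0857e24   # centimetres per megaparsec
M_P_G = 1.6726e-24   # proton mass in grams
M_SUN_G = 1.989e33   # solar mass in grams

def gas_mass_cylinder(norm, d_a_mpc, z, length_mpc, diameter_mpc,
                      ne_over_nh=1.2, rho_over_nh_mp=1.4):
    """Return (gas mass in solar masses, electron density in cm^-3).

    The emitting region is a uniform-density cylinder; the density follows
    from inverting the APEC normalization for the integral of n_e * n_H.
    """
    d_a_cm = d_a_mpc * MPC_CM
    # Emission integral n_e * n_H * V implied by the normalization.
    emission_integral = norm * 4.0 * math.pi * (d_a_cm * (1.0 + z)) ** 2 / 1e-14
    radius_cm = 0.5 * diameter_mpc * MPC_CM
    volume = math.pi * radius_cm ** 2 * length_mpc * MPC_CM
    n_h = math.sqrt(emission_integral / (ne_over_nh * volume))
    n_e = ne_over_nh * n_h
    mass_g = rho_over_nh_mp * M_P_G * n_h * volume
    return mass_g / M_SUN_G, n_e

# Hypothetical inputs: normalization, distance (Mpc), redshift, cylinder size (Mpc).
m1, ne1 = gas_mass_cylinder(1e-4, 930.0, 0.306, 2.0, 1.0)
m4, ne4 = gas_mass_cylinder(4e-4, 930.0, 0.306, 2.0, 1.0)
```

Because the density scales as the square root of the emission integral, quadrupling the normalization at fixed geometry doubles the inferred mass, which is one reason the adopted geometry matters as much as the fitted normalization.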

To assess the level of systematic uncertainties in our spectral measurements, we 
used the spectrum of the SW region, as it is the weakest and thus is the most prone 
to systematic uncertainties, and let the various sky background and NXB param- 
eters vary within their allowed ranges. We then applied a Markov chain Monte 
Carlo (MCMC) algorithm to sample the likelihood distribution. The posterior 
distributions for the measured parameters are then marginalized over the systematic 
uncertainties associated with the variation of the background components. 
Through this approach, we found a typical systematic uncertainty of ~20% on 
the gas temperature and <5% on the emission measure. These values provide an 
upper limit to the level of systematic uncertainties in the other regions since the 
intensity of the source relative to the background is higher than for the analysis 
carried out here. 

Analysis of ESO and CFHT optical data. We used the colours of galaxies in archi- 
val optical imaging of the Abell 2744 field to identify members of the cluster and its 
associated filaments. We constructed a photometric catalogue from observations 
obtained in the B, V and R filters using the WFI instrument at the ESO 2.2-m telescope 
at La Silla Observatory, combined with i-band data obtained with MegaCam/ 
MegaPrime at the CFHT. For the WFI BVR filters, we were able to use existing 
co-added images (B, 9,200 s; V, 8,700 s; R, 21,000 s) from a weak lensing follow-up 
of clusters detected in the Sunyaev-Zeldovich (SZ) effect. Observations spanning 
three campaigns between September 2000 and October 2011 were bias-subtracted 
and flat-fielded using the THELI processing pipeline. THELI also includes 
astrometric, relative and absolute photometric calibration. Finally, the CFHT 
i-band data obtained in July 2009 were reduced using the CFHT-specific THELI 
adaptation developed and applied for the CFHTLenS project. For all filters, the 
co-added images were post-processed, and saturated stars and otherwise unreliable 
image areas were masked out. Source catalogues were distilled from the co-added 
images using the weak lensing pipeline from ref. 22. Because of the different fields 
of view of the cameras involved (34′ × 34′ for WFI versus 60′ × 60′ for CFHT 
MegaCam), it proved useful to adopt the following strategy: we measured source 
photometry in all three WFI passbands in one go, making use of the double detection 
mode in SEXTRACTOR, with the deep R-band data as the detection image. 
In order to obtain consistent magnitudes, photometric quantities were measured 
after having matched the seeing in the other filters to the poorest seeing among 
them. A separate detection run was performed for the CFHT i′-band data. The output 
catalogues were merged, identifying as the same object sources detected in WFI 
and CFHT within 0.5 arcsec of each other, yielding a common photometric cata- 
logue containing 37 WFI galaxies per square arcmin. Objects were categorized as 
stars or galaxies based on their apparent size and magnitude. 
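
The catalogue merging step (sources within 0.5 arcsec of each other treated as the same object) can be sketched as a nearest-neighbour match. The function `cross_match` and the coordinates below are hypothetical; the sketch uses a brute-force search with the small-angle approximation and ignores right-ascension wrap-around, so it is suitable only for small catalogues in a compact field.

```python
import numpy as np

def cross_match(ra1, dec1, ra2, dec2, radius_arcsec=0.5):
    """Match each source in catalogue 1 to its nearest source in catalogue 2.

    Coordinates are in degrees. Returns (idx1, idx2): indices of the pairs
    whose separation is below radius_arcsec.
    """
    ra1, dec1, ra2, dec2 = (np.asarray(a, dtype=float) for a in (ra1, dec1, ra2, dec2))
    d_ra = (ra1[:, None] - ra2[None, :]) * np.cos(np.radians(dec1))[:, None]
    d_dec = dec1[:, None] - dec2[None, :]
    sep = np.hypot(d_ra, d_dec) * 3600.0  # degrees to arcsec
    nearest = sep.argmin(axis=1)
    matched = sep[np.arange(len(ra1)), nearest] < radius_arcsec
    return np.nonzero(matched)[0], nearest[matched]

# Hypothetical positions: the second source of catalogue 1 has a counterpart
# 0.2 arcsec away in right ascension; the first has none.
ra_wfi, dec_wfi = [3.58, 3.60], [-30.40, -30.38]
ra_cfht, dec_cfht = [3.60 + 0.2 / 3600.0, 3.70], [-30.38, -30.30]
idx_wfi, idx_cfht = cross_match(ra_wfi, dec_wfi, ra_cfht, dec_cfht)
```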

Lensing analysis of HST and CFHT data. Lensing constraints from the HST field 
of view. The strong lensing constraints used to model the inner core of Abell 2744 
consist of a set of 51 multiply-imaged systems (159 images). The weak lensing 
catalogue for the HST field of view was built following the methods described 
in ref. 43, and the details of the Abell 2744 weak-lensing catalogue will be given 
elsewhere (M.J. et al., manuscript in preparation). Here we give a brief summary 
of the different steps. 

The weak lensing analysis is based on shape measurements in the Advanced 
Camera for Surveys (ACS)/F814W band. Following a method developed for the 
analysis of data obtained for the COSMOS survey, the SEXTRACTOR photometry 
package was used for the detection of the sources. The resulting catalogue 
was then cleaned by removing spurious sources, duplicate detections, and any 
sources in the vicinity of stars or saturated pixels. Finally, to overcome the pattern- 
dependent correlations introduced by the drizzling process between neighbouring 
pixels, we simply scaled up the noise level in each pixel by the same constant 
F_A ≈ 0.316 (ref. 45). 

Since only galaxies behind the cluster are gravitationally lensed, the presence of 
cluster members dilutes the observed shear and reduces the statistical significance 
of all quantities derived from it. Therefore, the identification and removal of the 
contaminating unlensed galaxies is crucial. Thanks to the HST data in three bands 
(F814W, F606W and F435W), we identified the foreground galaxies and cluster 
members using a colour-colour diagram. Galaxy shapes were measured 
using the Rhodes-Refregier-Groth (RRG) method, adapted to multi-epoch 
images such as those from the HSTFF data of Abell 2744. Finally, galaxies 
with ill-determined shape parameters were excluded, since these galaxies do not 
contribute substantially to the shear signal. 

Lensing constraints from the CFHT field of view. We employed the popular 
Kaiser-Squires-Broadhurst (KSB) method for galaxy shear measurement. We modelled 
the observed galaxy shape as a convolution of the (sheared) galaxy with the point 
spread function (PSF), which is itself modelled as a circular profile convolved with 
a small anisotropy. For the PSF modelling, we identified stars in the size-magnitude 
and μ_max-magnitude planes chip by chip, where μ_max is the peak surface brightness. 
We then measured the Gaussian-weighted shape moments of the stars, and 
constructed their ellipticity. In addition to cuts in μ_max and magnitude, we also 
excluded noisy outliers with SNR < 100 or absolute ellipticity more than 2σ away 
from the mean local value, and we iteratively removed objects very different from 
neighbouring stars. Having obtained our clean sample of stars, we used a second-order 
polynomial in x and y to model the PSF across the field of view. The 
ellipticity of the PSF changes from its core to its wings. We measured the PSF shape 
using weight functions of different sizes and, when correcting each galaxy, used 
the weight function of the same size to measure the shapes of both the PSF and the 
galaxies. Background galaxies were selected with the magnitude cuts 20 < i′ < 26, 
size cuts 1.15 r_PSF < r_h < 10 pixels (where r_h is the half-light radius and r_PSF is the size 
of the largest star), SNR > 10 and SEXTRACTOR flag FLAGS = 0. After masking 
and catalogue cuts, the galaxy number density is ~10 galaxies per square arcmin. 
We then measured the shapes of all the selected galaxies. Our implementation of 
KSB is based on the KSBf90 pipeline. Details of the calibration and systematic 
effects are shown and discussed elsewhere. If the PSF anisotropy is small, the shear 
γ can be recovered to first order from the observed ellipticity e^obs of each galaxy via 


$$\gamma = P_\gamma^{-1}\left(e^{\rm obs} - \frac{P^{\rm sm}}{P^{{\rm sm}*}}\,e^{*}\right)$$


where asterisks indicate quantities that should be measured from the PSF model 
interpolated to the position of the galaxy, P^sm is the smear polarizability, and P_γ is 
the correction to the shear polarizability that includes smearing with the isotropic 
component of the PSF. The ellipticities were constructed from a combination of 
each object’s weighted quadrupole moments, and the other quantities involve 
higher-order shape moments. All definitions are taken from ref. 50. Note that we 
approximate the matrix P_γ by a scalar equal to half its trace. Since measurements 
of Tr P_γ from individual galaxies are noisy, we fit it as a function of galaxy size and 
magnitude, which are more robustly observable galaxy properties. 
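
The first-order KSB correction can be written as a short function. This is a sketch, not the KSBf90 code: ellipticities are represented as complex numbers e1 + i e2, the smear polarizabilities are treated as scalars in the same way the text treats P_γ (an assumption made here for simplicity), and the input values are hypothetical.

```python
import numpy as np

def ksb_shear(e_obs, p_sm, e_star, p_sm_star, p_gamma):
    """First-order KSB shear estimate for each galaxy.

    e_obs, e_star : complex ellipticities (e1 + 1j*e2) of the galaxy and of
        the PSF model interpolated to the galaxy position.
    p_sm, p_sm_star : scalar smear polarizabilities of the galaxy and PSF.
    p_gamma : scalar shear polarizability corrected for the isotropic PSF.
    """
    e_obs = np.asarray(e_obs, dtype=complex)
    e_star = np.asarray(e_star, dtype=complex)
    # Remove the PSF anisotropy, then rescale by the shear responsivity.
    e_corrected = e_obs - (np.asarray(p_sm) / np.asarray(p_sm_star)) * e_star
    return e_corrected / np.asarray(p_gamma)

# Hypothetical single-galaxy example.
gamma = ksb_shear(e_obs=0.10 + 0.02j, p_sm=0.5, e_star=0.02 + 0.0j,
                  p_sm_star=1.0, p_gamma=0.8)
```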

The weight of the shear contribution from each galaxy is defined as 

$$w = \frac{1}{\sigma_{e,i}^{2} + \sigma_{0}^{2}}$$

where σ_e,i is the error for an individual galaxy obtained via the formula in Appendix 
A of ref. 51, and σ_0 ≈ 0.3 is the dispersion of the intrinsic ellipticities of galaxies. 
With the help of the shear catalogue, we then estimated the total mass within the 
filaments. As the weak lensing effect is not very sensitive to the mass profile, we 
assumed a dual pseudo isothermal elliptical (dPIE) profile centred on the X-ray 
position to measure the total mass of the filament candidates using the parametric 
model-fitting algorithm LENSTOOL. We also tested the accuracy of the derived 
masses by fitting the shear profile again with an elliptical Navarro-Frenk-White 
(NFW) profile with a concentration c = 1. The measured masses are consistent 
within the uncertainties. 
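
The per-galaxy weighting mentioned at the start of this paragraph can be illustrated with a small sketch. It assumes the common inverse-variance form w = 1/(σ_0² + σ_e²), with σ_0 = 0.3 for the intrinsic ellipticity dispersion; the function names and input values are hypothetical.

```python
import numpy as np

def shear_weights(sigma_e, sigma_0=0.3):
    """Inverse-variance weights from measurement error and intrinsic scatter."""
    return 1.0 / (sigma_0 ** 2 + np.asarray(sigma_e, dtype=float) ** 2)

def weighted_mean_shear(gamma, sigma_e, sigma_0=0.3):
    """Weighted average of per-galaxy shear estimates."""
    w = shear_weights(sigma_e, sigma_0)
    return np.sum(w * np.asarray(gamma)) / np.sum(w)

# Two galaxies with equal measurement errors contribute equally.
weights = shear_weights([0.4, 0.4])
mean_shear = weighted_mean_shear([0.1, 0.3], [0.4, 0.4])
```

With this form, galaxies with noisy shape measurements are down-weighted, but no galaxy can carry more weight than the intrinsic scatter allows.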

Lensing mass model. The mass model built for this analysis used strong and 
weak lensing constraints, combining parametric and non-parametric approaches 
to model the global mass distribution. The details of the mass modelling will be 
given elsewhere (M.J. et al., manuscript in preparation). We kept the parametric 
model built for the strong lensing analysis of Abell 2744 fixed to its best-fit 
values, and we modelled the surrounding mass distribution using a multi-scale 
grid drawn from a prior light distribution of the cluster using the WFI multi-band 
photometric catalogue. The nodes of the grid model were parameterized using 
Radial Basis Functions (RBFs). This allowed us to appropriately weight the strong 
lensing constraints without taking them twice into account. 

The strong lensing parametric model was composed of two cluster-scale haloes. 
The multi-scale grid was composed of 10,282 RBFs, for which only the ampli- 
tude was left free while fitting. To the 733 cluster members identified in the HST 
fields of view, we added 1,457 cluster members identified using a standard colour- 
magnitude selection using B, V and R bands coming from WFI observations to 
identify the red-sequence galaxies of the cluster. Galaxy-scale haloes were modelled 
as RBFs, using dPIE potentials. The resulting mass map is shown by the white 
contours in Fig. 3. 

We sampled the parameter space using the Bayessys Library 
implemented in LENSTOOL. The objective function is a standard likelihood 
function in which noise is assumed to be Gaussian. LENSTOOL returns a large 
number of MCMC samples, from which we estimate mean values and uncertainties 
in the mass density field. In Extended Data Fig. 5 we show the radial surface mass 
density profile for the cluster average compared to the sectors encompassing the 
filaments (same as for Extended Data Fig. 1). An excess lensing signal is observed 
in the direction of the filaments compared to the radial average. The masses 
obtained using this technique are given in Table 1. In Extended Data Table 2 we 
show the masses and SNRs obtained using this method (hybrid LENSTOOL) and 
the direct inversion method described above (KSB) for the various filaments. The 
results of the two methods agree within the uncertainties. The differences observed 
between one method and the other allow us to quantify the level of systematic 
uncertainties associated with the lensing reconstruction using the existing data. 

Sample size. No statistical methods were used to predetermine sample size. 

Code availability. The PROFFIT code for X-ray surface brightness analysis is 
available at http://www.isdc.unige.ch/~deckert/newsite/Proffit.html. The THELI 
data reduction scheme for CFHT and ESO/WFI data can be downloaded at https:// 
www.astro.uni-bonn.de/theli/. The gravitational lensing code LENSTOOL can be 
found at http://projets.lam.fr/projects/lenstool/wiki. The KSBf90 code used for 
weak lensing is available at http://www.roe.ac.uk/~heymans/KSBf90/Home.html. 
Extended Data Figure 1 | Radial X-ray emissivity profiles in the filaments and in the cluster. Shown are XMM-Newton/EPIC surface-brightness 
profiles (S_X); black, obtained by masking the filaments; colours, surface brightness in the sectors NW (northwest, position angle 10°-70°), 
E (east, 150°-180°) and S (south, 260°-300°). Uncertainties (error bars) are given at the 1σ level. 
Extended Data Figure 2 | Regions used for the analysis of the 
thermodynamic properties of the filaments. The 0.5-2 keV surface 
brightness level is colour coded (bar at right; units are erg s^-1 cm^-2 
arcmin^-2); right ascension and declination are in degrees. Spectra were 
extracted from the regions indicated as E, S, SW, NW1 and NW2 by the 
white ellipses. The green circles show the regions labelled as Offset1-4 
used to estimate the local background components (see Extended Data 
Table 1). The dashed cyan sectors show the regions used to extract the 
radial profiles along the filaments for Extended Data Figs 1, 3 and 5. The 
grey ellipses show background/foreground structures masked during the 
analysis (see text). 
Extended Data Figure 3 | Radial galaxy density profiles in the filaments and in the cluster. Galaxy density profiles (N_gal) using spectroscopically 
confirmed cluster members in sectors encompassing the filaments (same as Extended Data Fig. 1) are compared to the galaxy density of the cluster 
obtained by masking the filaments (black). Uncertainties (error bars) are given at the 1σ level. 


© 2015 Macmillan Publishers Limited. All rights reserved 




[Extended Data Figure 4 image: X-ray spectra in panels labelled a) Background, c) South, d) South-West, e) North-West1 and f) North-West2; axes are energy (keV) against normalized counts s^-1 keV^-1.] 
Extended Data Figure 4 | X-ray spectra of the filaments. a-f, XMM-Newton/EPIC-pn spectra for the regions shown in Extended Data Fig. 2. 
The background region (a) refers to Offset1. The fitting procedure was performed jointly on all EPIC instruments; however, only the pn spectra 
are shown here for clarity. The coloured lines show fitted contributions from the source (red), the NXB (blue), the CXB (green), the Galactic halo 
(cyan), and the local hot bubble (magenta). 
Extended Data Figure 5 | Radial mass profiles in the filaments and in the cluster. Shown are surface mass density profiles obtained from combined strong 
and weak lensing. The black curve shows the cluster average, compared to the profiles obtained in the direction of the filaments (same as Extended Data Fig. 1). 
Extended Data Table 1 | Properties of the X-ray background in the Abell 2744 region 

Region     CXB                    Halo kT        Halo Norm              LB Norm 
Offset 1   (6.26±0.56) × 10^-7    0.297±0.024    (4.45±0.60) × 10^-7    (1.89±0.25) × 10^-6 
Offset 2   (7.03±0.71) × 10^-7    0.368±0.095    (2.31±0.91) × 10^-7    (2.36±0.36) × 10^-6 
Offset 3   (6.92±0.78) × 10^-7    0.311±0.034    (5.05±0.88) × 10^-7    (2.14±0.36) × 10^-6 
Offset 4   (7.65±0.71) × 10^-7    0.283±0.036    (3.52±0.82) × 10^-7    (2.40±0.28) × 10^-6 

Comparison of X-ray background parameters per square arcminute obtained in regions Offset 1, 2, 3 and 4 (see Extended Data Fig. 2). CXB, cosmic X-ray background, in photons keV^-1 cm^-2 s^-1 at 
1 keV; Halo kT, in keV; Halo Norm, halo normalization, and LB Norm, local bubble normalization, both as ∫ n_e n_H dV × 10^-14/(4π d_A^2 (1+z)^2), where d_A indicates the angular diameter distance at redshift z. 
Extended Data Table 2 | Mass of the filaments 

Region   M_HLT [h_70^-1 M⊙]   S/N   M_KSB [h_70^-1 M⊙]   S/N 

E (7.942.8)x10% 3.1 (4.443.1)x10° 2.1 

S (9.5+2.4)x10!3 68 (4.042.4)x10¥% 2.3 

SW _(4.8+1.7)x 103 3.1 (2.241.6)x10¥% 2.8 

NW1 (9.5+2.7)x104% 5.2 (6.9+3.0)x10% 2.2 

NW2 (1.2+0.3)x 10! 3.3 (2.2+1.0)x10% 2.6 
Comparison of weak-lensing masses for the filaments for the two methods used here: the grid-based multi-scale approach (hybrid LENSTOOL, HLT, giving M_HLT) and the direct inversion 
method (KSB, giving M_KSB). 
Relativistic baryonic jets from an ultraluminous supersoft X-ray source 

Ji-Feng Liu, Yu Bai, Song Wang, Stephen Justham, You-Jun Lu, Wei-Min Gu, Qing-Zhong Liu, Rosanne Di Stefano, 
Jin-Cheng Guo, Antonio Cabrera-Lavers, Pedro Alvarez, Yi Cao & Shri Kulkarni 


The formation of relativistic jets by an accreting compact object 
is one of the fundamental mysteries of astrophysics. Although 
the theory is poorly understood, observations of relativistic jets 
from systems known as microquasars (compact binary stars) 
have led to a well established phenomenology. Relativistic 
jets are not expected to be produced by sources with soft or 
supersoft X-ray spectra, although two such systems are known 
to produce relatively low-velocity bipolar outflows. Here we 
report the optical spectra of an ultraluminous supersoft X-ray 
source (ULS) in the nearby galaxy M81 (M81 ULS-1; refs 9, 10). 
Unexpectedly, the spectra show blueshifted, broad Hα emission 
lines, characteristic of baryonic jets with relativistic speeds. These 
time-variable emission lines have projected velocities of about 
17 per cent of the speed of light, and seem to be similar to those 
from the prototype microquasar SS 433 (refs 11, 12). Such relativistic 
jets are not expected to be launched from white dwarfs, and an 
origin from a black hole or a neutron star is hard to reconcile with 
the persistence of M81 ULS-1’s soft X-rays. Thus the unexpected 
presence of relativistic jets in a ULS challenges canonical theories 
of jet formation, but might be explained by a long-speculated, 
supercritically accreting black hole with optically thick outflows. 


Initial spectroscopic observations of M81 ULS-1, made at the W. M. 
Keck Observatory in 2010, found broad Balmer hydrogen emission 
lines (as wide as 400 km s^-1) on top of a power-law-like blue continuum. 
A very broad emission line (as wide as 30 Å, corresponding 
to 2,000 km s^-1) was detected at around 5,532 Å and 5,543 Å in both 
observations, but was not identified with any known spectral lines. We 
followed up with new spectra obtained at the Gran Telescopio Canarias 
in 2015, which again showed the Balmer emission lines and the blue 
continuum; however, the previously unidentified broad emission line 
was now at a notably changed wavelength of 5,648 Å (Fig. 1). This 
change in observer-frame wavelength immediately suggests that the 
previously unidentified emission line is a blueshifted Hα emission line 
emitted by an approaching baryonic relativistic jet, at projected velocities 
of 17% of the speed of light (−0.17c). Subsequent spectra reveal 
ongoing changes in the projected velocity of the blueshifted jet, for 
which we suggest that the best explanation is jet precession, as observed 
in the prototype microquasar SS 433. 
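
The link between the observed wavelengths and a projected velocity can be checked with a short calculation. The sketch below assumes the relativistic Doppler formula for motion purely along the line of sight and neglects the systemic velocity of M81; the rest wavelength of 6,562.8 Å for Hα and the function name are choices made for this illustration, not details given in the text. Under these assumptions the 5,532 Å line corresponds to about −0.17c and the 5,648 Å line to about −0.15c.

```python
import math

H_ALPHA_REST = 6562.8  # rest wavelength of H-alpha in angstroms (assumed value)

def line_of_sight_beta(lambda_obs, lambda_rest=H_ALPHA_REST):
    """Velocity in units of c from the relativistic longitudinal Doppler shift.

    Negative values mean approach (blueshift). Purely line-of-sight motion
    is assumed, so the result is a projected velocity rather than the
    intrinsic jet speed.
    """
    ratio_sq = (lambda_obs / lambda_rest) ** 2
    return (ratio_sq - 1.0) / (ratio_sq + 1.0)

beta_keck = line_of_sight_beta(5532.0)  # line position in the 2010 spectrum
beta_gtc = line_of_sight_beta(5648.0)   # line position in the 2015 spectrum
```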

SS 433 has exhibited time-variable blueshifted and redshifted optical 
emission lines from its precessing jets, the long-term monitoring of 
which has revealed a precession period of 164 days, and an intrinsic 
jet velocity of 0.26c. M81 ULS-1 is only the second microquasar 


Figure 1 | Spectra obtained from the W. M. Keck Observatory and the Gran Telescopio 
Canarias (GTC) for the optical counterpart of M81 ULS-1. a, The Keck/LRIS (Low Resolution 
Imaging Spectrometer) spectrum taken on 13 April 2010 (blue channel; shown in black) and 
the GTC/OSIRIS (Optical System for Imaging and Low/Intermediate-Resolution Integrated 
Spectroscopy) spectrum taken on 8 April 2015 (shown in blue) for M81 ULS-1. Labelled are 
the broad Balmer lines (Hα and Hβ), and the very broad blueshifted Hα^- lines at 5,530 Å 
(Keck/LRIS) and 5,648 Å (GTC/OSIRIS). The power-law-like continuum and the broad 
Balmer lines are characteristic of an accretion disk around a compact object, confirming the 
physical association between the X-ray source and its optical counterpart. b, The blueshifted 
Hα^- emission lines from six Keck and GTC observations, with time-variable, observer-frame 
central wavelengths. The intensities also change with time in proportion to the intensities of the 
stationary Hα emission line from the accretion disk, suggesting a link between the accretion and 
the jet. See Methods for details. Both the spectra and the fits are normalized by the underlying 
continuum, and are shifted vertically for clarity. 
Figure 2 | Cumulative distributions of photon energies for M81 ULS-1. 
Shown are the energies in the low-flux and high-flux X-ray states. For 
comparison, we have also plotted the photon energies from two typical 
ultraluminous X-ray sources (ULXs) in nearby galaxies, and six Galactic 
microquasars observed with the Advanced CCD Imaging Spectrometer 
(ACIS) aboard the Chandra X-Ray Observatory. Most photons concentrate 
at energies where the cumulative distribution rises fastest. For M81 ULS-1, 
more than 95% of the photons have energies below 1 keV in both the 
low-flux state (black dashed line) and the high-flux state (black solid line). 
For both of the ULXs, fewer than 15% of the photons have energies below 
1 keV. For the Galactic microquasars, only a few per cent to less than 
35 per cent of the photons have energies below 1 keV. Because the response 
matrix is not well calibrated below 0.3 keV for Chandra/ACIS, only 
photons with energies greater than 0.3 keV are shown here. 


to be identified through measuring directly the blueshifting of Ha 
lines emitted by its baryonic jets. Other known microquasars!” have 
mostly been identified through direct imaging of their radio jets, or by 
interpreting strong non-thermal radio emission as arising from their 
relativistic jets with velocities above 0.1c. M81 ULS-1 has not previously 
been detected by radio surveys, but this is not surprising given the 
great distance to the galaxy M81. Were SS 433 placed in M81, its radio 
flux at Earth would be about 1 μJy, which is below the detection sensitivity of
current radio facilities, but achievable in the future with the Square 
Kilometer Array”. 

From its X-ray properties'®, M81 ULS-1 seems to be a truly unique 
jet source, different to all other known microquasars!?. Since the 
launch of the Chandra X-Ray Observatory, all observations of M81 
have detected ULS-1, which exhibits high-flux and low-flux states with 
count rates ranging from 1 to 70 photons per kilosecond. When in 
high-flux states, M81 ULS-1 clearly exhibits supersoft spectra with 
blackbody temperatures of 65-100 eV and bolometric luminosities 
greater than 10°’ erg s~'. Somewhat surprisingly, the low-flux state 
of M81 ULS-1 appears to be as supersoft as the high-flux state, with 
more than 95% of the photons having energies below 1 keV (Fig. 2). In 
contrast, all other known microquasars’” are low-mass or high-mass 
X-ray binaries, each shown or thought to contain a neutron star or 
black hole, emitting abundant hard photons with energies above 1 keV. 
Observations by the Chandra X-Ray Observatory show that only a few 
per cent to 35 per cent of the photons from these microquasars have 
energies below 1 keV (Fig. 2). 
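
The soft-photon fractions quoted above come from cumulative distributions of photon energies. A minimal sketch of that bookkeeping follows, assuming only a plain list of photon energies in keV. The function names and the two short energy lists are hypothetical and made up for illustration; they are not Chandra data and this is not the authors' analysis code.

```python
def fraction_below(energies_kev, threshold_kev=1.0, floor_kev=0.3):
    """Fraction of photons above floor_kev whose energy is below threshold_kev."""
    kept = [e for e in energies_kev if e > floor_kev]
    if not kept:
        raise ValueError("no photons above the calibration floor")
    return sum(1 for e in kept if e < threshold_kev) / len(kept)


def cumulative_distribution(energies_kev, floor_kev=0.3):
    """Sorted (energy, cumulative fraction) pairs for photons above floor_kev."""
    kept = sorted(e for e in energies_kev if e > floor_kev)
    total = len(kept)
    return [(e, (i + 1) / total) for i, e in enumerate(kept)]


# Made-up photon energies in keV, for illustration only.
soft_source = [0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, 1.4]
hard_source = [0.6, 1.2, 1.5, 2.0, 2.4, 3.0, 3.5, 4.2, 5.0, 6.1]

print(fraction_below(soft_source))  # 0.9
print(fraction_below(hard_source))  # 0.1
```

The 0.3 keV floor mirrors the cut used for Fig. 2, where photons below 0.3 keV are excluded because the response matrix is poorly calibrated there.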

Luminous supersoft sources”* have supersoft X-ray spectra, and 
those with luminosities below the Eddington limit for a solar-mass 
object are conventionally interpreted as white dwarfs accreting at a
rate of about 10⁻⁷ M☉ yr⁻¹ to 10⁻⁶ M☉ yr⁻¹ (where M☉ is the mass of
the Sun), where hydrogen fusion within the accreted material proceeds 
steadily”*>. But for M81 ULS-1, the presence of relativistic jets suggests 
otherwise: such jets are simply not expected for typical white dwarfs’’. 
Indeed, although bipolar outflows with low velocities of a few thousand 
kilometres per second are possible and have been observed in supersoft 
sources such as RX J0513.9-6951 (ref. 5) and RX J0019.8+2156 (ref. 6), 
no relativistic jets have ever, to our knowledge, been observed from 
supersoft sources other than M81 ULS-1. These considerations suggest 
that the accreting object in M81 ULS-1 is not a white dwarf, adding 
strong evidence to the idea**”’ that supersoft sources, especially the 
ultraluminous ones, do not necessarily contain accreting white dwarfs. 

If, instead, the central engine of M81 ULS-1 is a neutron star or 
a black hole—as is the case for all other known microquasars— 
established phenomenology* would predict steady jets to be gener- 
ated when X-ray emissions are in the low-hard state, with episodic 
jets generated when emissions are in the very high state, or during 
the transitions between soft and hard states. In the case of M81 ULS-1,
the blueshifted Hα emission lines emitted from the relativistic
jets were present in all six optical spectroscopic observations in
2010 and 2015. Standard presumptions would therefore be that M81
ULS-1 is in the low-hard or very high states for a substantial frac- 
tion of the time, during which abundant hard photons (with energies 
above 1 keV) would be expected, as for other microquasars (Fig. 2). 
However, X-ray emissions from ULS-1 have been supersoft in all 19 
Chandra observations, regardless of whether ULS-1 is displaying a 
low-flux or a high-flux state—suggesting that its relativistic jets are 
not generated in the canonical ways. In fact, the persistently supersoft 
appearance of ULS-1 would not be expected in any spectral states in 
the standard accretion scenarios”®”? for neutron-star or black-hole 
X-ray binaries, which are known to be accreting below the critical 
(that is, Eddington) rate. 

This unusual combination of relativistic jets and persistently super- 
soft X-ray spectra is completely unexpected, posing a challenge to the 
conventional understanding of jet formation*“. One possible identity 
for M81 ULS-1 is a long-speculated'*!°, supercritically accreting black 
hole with optically thick outflows. Recent magnetohydrodynamic 
simulations of such systems, although still under development and 
the subject of heated debate'®!”, can generate super-Eddington lumi- 
nosities, and necessarily*” generate disk winds and funnels along the 
rotation axis, from which radiation pressure will drive baryon-loaded 
relativistic jets with velocities of up to 0.3c, regardless of the black-hole 
spin’’. Observations of M81 ULS-1 qualitatively match the predic- 
tions of high luminosities and baryon-loaded relativistic jets, and its 
supersoft X-ray spectra might be expected from optically thick out- 
flows under suitable conditions of outflow geometry, wind velocities 
and outflow mass rates!*-”°. Thus, ULS-1 might be a manifestation of 
recent predictions of supercritical accretion onto black holes, and so 
reveal the nature of extreme accretion in extreme conditions. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper.

METHODS

GTC/OSIRIS and Keck/LRIS data reduction. Initial optical spectroscopic
observations of M81 ULS-1 were carried out with Keck/LRIS on 13 April and 17 April
2010, revealing broad Balmer emission lines as if from an accretion disk (ref. 21). The
blueshifted Hα emission line (Hα⁻) is shown at 5,530 Å in the spectra, with a
shift in the line centre of 10 Å ± 2 Å between those two observations (that is, two
epochs separated by four nights).

M81 ULS-1 was later observed using GTC/OSIRIS on 8 April, 7 May and
8 May 2015, with the 0.6″ slit and the R1000R grating, which yields
a resolution of about 7 Å. The spectra were reduced in a standard way with IRAF
(Image Reduction and Analysis Facility) software (http://iraf.noao.edu). After bias
subtraction and flat-field correction, dispersion correction was carried out on the basis
of the line lists given in the OSIRIS manual (http://www.gtc.iac.es/instruments/osiris/).
Raw spectra were then extracted with an aperture size of 1″, and a standard
star observed on each night was used for the flux calibration.

On 22 April 2015, another observation of M81 ULS-1 was carried out
using Keck/LRIS with the 1.0″ slit. The light was split by a dichroic at
6,800 Å into the blue and red sides, which used the 300/5,000 and 400/8,500
gratings, yielding a resolution of ~8 Å. The spectrum was reduced with
the IDL (Interactive Data Language) pipeline designed for the W. M. Keck
Observatory.

Extended Data Table 1 lists the basic information obtained from the 2010 and
2015 observations. Both Hα and Hα⁻ emission lines are detected in all the spectra,
and their line properties are calculated from Gaussian line profile fitting. Extended
Data Table 2 lists the central wavelength, the full width at half-maximum (FWHM)
and the equivalent width for each fitted emission line. The observed Hα⁻ central
wavelengths, λ₋, correspond to projected velocities, v_p, from −0.17c to −0.14c in
these observations, given by

$$\frac{\lambda_-}{\lambda_0} = \sqrt{\frac{1 + v_p/c}{1 - v_p/c}},$$

where λ₀ = 6,563 Å is the rest wavelength of Hα.

Properties of the emission lines. In the case of SS 433, the equivalent widths of Hα
emission lines are tightly correlated with the phases of the precession, and those of
the Hα⁻ emission follow a similar trend but with a phase delay (ref. 31). We use the power
of the emission lines, calculated from the area of the Gaussian fitting, as representative
of the emission intensity, because the observed continuum from M81 ULS-1
varies markedly between observations. Extended Data Fig. 1 shows that the power
of the Hα emission lines from the accretion disk is positively correlated with that
of the Hα⁻ emission lines, suggesting a link between the accretion and the jet. The
variations in the power of the emission lines are asymmetrical, with smooth rises
and steeper declines around 7 May 2015 (Extended Data Fig. 2), similar to the
variations seen in the SS 433 emission lines (ref. 31).

The rate at which the projected Hα⁻ velocity changes seems to be slower during
2015 than it was during 2010. The rate of change in 2015 was roughly 0.8 Å per day,
whereas it was 2.6 Å per day in 2010; if the velocity shift is due to precession, this
difference may be explained naturally, because the 2015 observations are sampling
a different part of the precession cycle. We can estimate a minimum likely precession
period by assuming that the turning point of the precession cycle occurred
at around the time of the observations of 7 and 8 May 2015 (see Extended Data
Figs 2, 3). If so, then, after the wavelength of the emission lines reached the maximum
on 8 May (Extended Data Fig. 2), the Hα⁻ emission probably turned back
towards shorter wavelengths at a rate of roughly 0.8 Å per day, indicating that the
half-precession period must be longer than 30 days.

There is a 115-Å gap between the observations of 13 April 2010 and 8 April
2015 (Extended Data Fig. 3), and if we assume that the maximum rate at which
the wavelength decreases is 2.6 Å per day, then the Hα⁻ line needs 44 days to
move by the required amount. Therefore, the half-precession period is probably
longer than 74 days, and the lower limit of the precession period is about 148 days.
More time-resolved spectra are needed in order to derive an accurate period and
to characterize further the apparent precession of the jets.
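
The arithmetic behind this lower limit can be written out as a short sketch. The inputs are the numbers quoted in the text (a 115 Å gap, a maximum drift of 2.6 Å per day, and at least 30 days already covered by the 2015 data); the function name is hypothetical.

```python
def precession_lower_limit(gap_angstrom, max_rate, covered_days):
    """Lower limits on the half and full precession period, in days.

    gap_angstrom: wavelength gap the line still has to cross
    max_rate:     fastest observed drift, in angstroms per day
    covered_days: part of the half cycle already spanned by observations
    """
    crossing_days = gap_angstrom / max_rate
    half_period = covered_days + crossing_days
    return half_period, 2.0 * half_period


half, full = precession_lower_limit(115.0, 2.6, 30.0)
print(round(half), round(full))  # 74 148
```

Using the fastest observed drift gives the shortest possible crossing time, which is why the result is a lower limit rather than an estimate of the period.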

Searching for the redshifted Hα emission lines. Given the existence of blueshifted
Hα emission lines from the approaching jets, redshifted Hα emission lines (Hα⁺)
would be expected from receding jets, albeit with much lower intensities (because
of Doppler boosting effects). Assuming symmetrical and steady jets, the boosting
factors (D) for the lines emitted from the approaching and the receding jets are
given by

$$D_- = \frac{(1-\beta^2)^{1/2}}{1+\beta\cos\theta} > 1 \quad\text{and}\quad D_+ = \frac{(1-\beta^2)^{1/2}}{1-\beta\cos\theta} < 1,$$

respectively. Here β is the signed velocity of the approaching jet in units of c
(negative, like v_p above) and θ is the inclination of the jets to the line of sight.
The total flux of a blueshifted or redshifted line in the observer frame is boosted
by a factor of D³. The expected central wavelengths of the two lines are given by
λ₋ = λ₀/D₋ and λ₊ = λ₀/D₊, and the corresponding redshifts are z₊ = λ₊/λ₀ − 1 > 0 and
z₋ = λ₋/λ₀ − 1 < 0. D and z values for Hα⁻/Hα⁺ in all observations are listed in
Extended Data Table 3.

We have searched for the redshifted Hα⁺ emission lines in all observations. A
weak emission line feature was detected at ~3σ at around 7,524 Å (Fig. 1), roughly
symmetrical to Hα⁻ at 5,648 Å, in one of the GTC exposures during the night of
8 April 2015. If this marginal detection were the redshifted Hα⁺ line, its boosting
factor would be D₊ = λ₀/λ₊ = 0.8722, and the ratio of the received total flux of
the blueshifted line to that of the redshifted line should be D₋³/D₊³ ≈ 2.5, which
is roughly consistent with the observed flux (~47}'2). 


However, the observed wavelength is not consistent with the expected Hα⁺ wavelength
given the blueshifted Hα⁻ at 5,648 Å, that is,

$$6{,}563\ \text{Å} \times \frac{1-\beta\cos\theta}{(1-\beta^2)^{1/2}}.$$

Assuming the extreme case, θ = 0°, then we have |β| = 0.1491. If the receding jet has the same
velocity, then the expected central wavelength of Hα⁺ should be 7,626.4 Å, which
is about 104 Å larger than the detected line. If we assume that θ = 10°/20°/30°, then
|β| = 0.152/0.160/0.177, and the expected central wavelength of Hα⁺ is
7,632 Å/7,650 Å/7,688 Å, which is about 108 Å/126 Å/164 Å larger than the detected
line. The discrepancy becomes larger for larger inclination angles.
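
The jet speeds and expected Hα⁺ wavelengths above can be reproduced approximately with the sketch below. It is an illustration under stated assumptions, not the authors' code: the speed is treated as a positive number (so the signs in front of β cos θ are written out explicitly for each jet), the rest wavelength is taken as 6,563 Å, the blueshifted line is placed at 5,647.5 Å, and the function names are hypothetical.

```python
import math

H_ALPHA_REST = 6563.0  # assumed rest wavelength of H-alpha, in angstroms


def jet_speed(d_blue, theta_deg):
    """Positive speed b (units of c) giving boost d_blue for the approaching jet.

    Solves d_blue = sqrt(1 - b**2) / (1 - b*cos(theta)) for the physical root.
    """
    cos_t = math.cos(math.radians(theta_deg))
    sin2_t = 1.0 - cos_t * cos_t
    d2 = d_blue * d_blue
    return (d2 * cos_t - math.sqrt(1.0 - d2 * sin2_t)) / (d2 * cos_t * cos_t + 1.0)


def receding_wavelength(speed, theta_deg, rest=H_ALPHA_REST):
    """Observed wavelength of the line from a receding jet with the same speed."""
    cos_t = math.cos(math.radians(theta_deg))
    return rest * (1.0 + speed * cos_t) / math.sqrt(1.0 - speed * speed)


d_blue = H_ALPHA_REST / 5647.5  # boost factor of the blueshifted line
for theta in (0, 10, 20, 30):
    b = jet_speed(d_blue, theta)
    print(theta, round(b, 3), round(receding_wavelength(b, theta), 1))
```

The closed-form root follows from squaring the boost relation, which gives a quadratic in b whose discriminant reduces to 1 − D² sin²θ; the smaller root is the physical one for these inclinations.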

This casts doubt on the identification of the 7,524-Å line feature as Hα⁺, unless
the jets are asymmetrical or fast-changing. We may not have detected the redshifted
Hα⁺, but the non-detection is not surprising given the Doppler boosting effects
and other realistic explanations. For example, the receding jets may be blocked by
the optically thick outflows if this system is a supercritically accreting black-hole
system, as described in the text. No candidate Hα⁺ emission lines were detected
at all in the 7 and 8 May GTC observations, or in the Keck spectrum. Even if the
8 April line were a true Hα⁺ emission line, this non-detection would not be surprising,
given the lower equivalent widths of Hα⁻ on 7 and 8 May, and the relatively
lower sensitivity in the red channel of LRIS.
Analysis of Chandra data. There have been 19 Chandra/ACIS observations of the
nuclear region of M81, where ULS-1 resides. All of these observations were retrieved
from the Chandra archive and analysed uniformly with CIAO 4.7 software tools
(http://cxc.harvard.edu/ciao/). Point sources were detected with WAVDETECT
on the individual Chandra images. As listed in Extended Data Table 4, the photon
counts were extracted from the source ellipses enclosing 95% of the total photons
as reported by WAVDETECT, which was run with scales of 1″, 2″, 4″ and 8″ in
the 0.3 to 8.0 keV band.

The spectra in the high-flux states (>10 counts per kilosecond) were fitted by 
absorbed blackbody models, with the spectral parameters presented in Extended 
Data Table 4, all of which show that M81 ULS-1 has been persistently supersoft 
in these observations. In addition, the spectra in the high-flux and low-flux states 
were added together into combined high- and low-state spectra, and were also 
fitted in the band 0.3-8.0 keV. Using the fitted absorbed blackbody model, we 
calculated the 0.3-8.0 keV flux, the 0.3-8.0 keV luminosity and the bolometric 
luminosity with the distance of 3.63 megaparsecs for M81 (ref. 32). 
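
The conversion from flux to luminosity at the adopted distance can be sketched as follows, assuming isotropic emission. The flux value is the 0.3-8.0 keV flux listed for observation 735 in Extended Data Table 4; the function name and the megaparsec-to-centimetre constant are assumptions of this sketch.

```python
import math

CM_PER_MPC = 3.0857e24  # centimetres in one megaparsec


def isotropic_luminosity(flux_cgs, distance_mpc):
    """Luminosity in erg/s from a flux in erg/s/cm^2, assuming isotropic emission."""
    distance_cm = distance_mpc * CM_PER_MPC
    return 4.0 * math.pi * distance_cm ** 2 * flux_cgs


lum = isotropic_luminosity(1.99e-13, 3.63)
print(f"{lum:.2e} erg/s")
```

This gives roughly 3 × 10³⁸ erg s⁻¹ for the band-limited luminosity; the bolometric values in the table are larger because they also correct for absorption and for emission outside the fitted band.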

As plotted in Extended Data Fig. 4, M81 ULS-1 displays a soft excess below
0.3 keV as compared to the best-fit model for 0.3-8.0 keV. However, considering
that the response matrix for Chandra observations is not well calibrated below 
0.3 keV, we refrain from interpreting this soft excess. Nonetheless, it is clear that 
M81 ULS-1 has very different spectral properties from the other known micro- 
quasars. Moreover, these uncertainties in calibration below 0.3 keV might merely 
make the intrinsic spectral differences between M81 ULS-1 and the other known 
microquasars even larger (that is, the energy distribution from M81 ULS-1 might 
be even softer than observed). 
Code availability. The optical spectra were reduced with IRAF, available at
http://iraf.noao.edu/. All of the emission lines in Extended Data Table 2 were fitted with
the curve-fitting toolbox based on Matlab (http://www.mathworks.com/help/curvefit/index.html).
The Chandra archive data were analysed with CIAO 4.7,
which can be downloaded from http://cxc.harvard.edu/ciao/download/.


31. Vittone, A., Rusconi, L., Sedmak, G., Mammano, A. & Ciatti, F. Correlations and 
periodicities of equivalent widths in SS 433. Astron. Astrophys. (Suppl.) 53, 
109-117 (1983). 

32. Freedman, W. L. et al. The Hubble Space Telescope extragalactic distance scale
key project. I. The discovery of Cepheids and a new distance to M81.
Astrophys. J. 427, 628-655 (1994).


© 2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


Extended Data Figure 1 | The power of Hα⁻ emission versus that of Hα emission for M81 ULS-1, in units of 10⁻¹⁶ erg s⁻¹ cm⁻². The error bars
denote 68.3% uncertainty. Observations are given as year followed by month followed by day.



Extended Data Figure 2 | Variation in the power of emission lines from M81 ULS-1, in units of 10⁻¹⁶ erg s⁻¹ cm⁻². The error bars denote 68.3%
uncertainty. The x-axis gives the observation date as a Heliocentric Julian Date (HJD).

Extended Data Figure 3 | The centre of Hα⁻ emission as a function of the relative observational date. The dates of observations are marked in the
figure; the ‘relative observational date’ refers to the date relative to the first observation of 2010 or 2015. The error bars denote 68.3% uncertainty.

Extended Data Figure 4 | The M81 ULS-1 spectra and model fitting. The spectra from Chandra observation ID 735, the combined high-state
observations, and the combined low-state observations are shown with red, blue, and black crosses respectively. The corresponding blackbody models in
the energy range 0.3-8.0 keV are shown with red, blue, and black dotted lines. The yellow dashed line indicates photon energy of 0.3 keV. The error bars
denote 68.3% uncertainty.
Extended Data Table 1 | Observations of M81 ULS-1

Date         Telescope   Exposure time (s)   Δλ (Å)
2010.4.13    Keck        1000×3              5
2010.4.17    Keck        1200×2              5
2015.4.08    GTC         1800×3              7
2015.4.22    Keck        2800×2              8
2015.5.07    GTC         1800×3              7
2015.5.08    GTC         1800×4              7

Δλ indicates spectral resolution.
Extended Data Table 2 | Properties of Hα⁺/Hα⁻ and Hα for M81 ULS-1

              Hα⁻ (bottom row: Hα⁺)                                   Hα
HJD−2450000   Centre                  FWHM     EW       Power         Centre         FWHM         Power
5299.83       5532.3 ± 1.5 (−0.17c)   33 ± 4   41 ± 5   1.53 ± 0.20   6562.9 ± 0.2   10.8 ± 0.5   1.69 ± 0.09
5303.77       5543.1 ± 1.6 (−0.17c)   33 ± 4   24 ± 3   0.94 ± 0.13   6562.8 ± 0.2   9.4 ± 0.3    1.30 ± 0.05
7121.47       5647.5 ± 2.3 (−0.15c)   32 ± 6   21 ± 5   0.65 ± 0.15   6564.9 ± 0.3   6.7 ± 0.6    0.31 ± 0.04
7134.88       5683.0 ± 3.0 (−0.14c)   34 ± 7   33 ± 9   1.07 ± 0.31   6564.4 ± 0.9   8.9 ± 2.1    0.86 ± 0.27
7150.42       5696.0 ± 3.1 (−0.14c)   46 ± 8   16 ± 4   1.08 ± 0.24   6564.1 ± 0.1   7.6 ± 0.3    1.60 ± 0.09
7151.46       5695.2 ± 2.3 (−0.14c)   29 ± 6   13 ± 3   0.57 ± 0.15   6564.4 ± 0.2   7.4 ± 0.5    0.79 ± 0.06
7121.47       7522.1 ± 2.7 (+0.14c)   20 ± 6   27 ± 7   0.16 ± 0.05

The centre, FWHM and equivalent width (EW) are in units of ångströms. The numbers in parentheses are velocities, in units of the speed of light (c). The power is in units of 10⁻¹⁶ erg s⁻¹ cm⁻². All of the
error bars denote 68.3% uncertainty. The bottom row shows Hα⁺ emission; the other rows show Hα⁻ emission.
Extended Data Table 3 | Doppler boost factors for each observation of M81 ULS-1

Date        λ(Hα⁻ or Hα⁺) (Å)   D        z
20100413    5532.5              1.1862   −0.1570
20100417    5543.1              1.1840   −0.1554
20150408    5647.5              1.1621   −0.1395
20150422    5683.0              1.1548   −0.1341
20150507    5696.0              1.1522   −0.1321
20150508    5695.2              1.1523   −0.1322
20150408    7524.0              0.8722   0.1465

The second column gives the wavelength of the blueshifted/redshifted Hα emission. D is the Doppler boost factor; z is the redshift.
Extended Data Table 4 | Chandra observations of M81 ULS-1

ObsID     Obs date     ExpT  C_net  C_soft  Count rate     kT_bb       n_H           Flux             L_X              L_bol            χ²/dof    State
                       (ks)                 (count ks⁻¹)   (eV)        (10²⁰ cm⁻²)   (erg s⁻¹ cm⁻²)   (10³⁸ erg s⁻¹)   (10³⁸ erg s⁻¹)
acis390   2000 Mar 21  2.4   140    25      60.81 ± 5.74   145 ± 40.9  12.5 ± 20.0   1.42e-13         2.3              6.7              1.390/4   high
acis735   2000 May 07  50.7  3679   2141    67.16 ± 1.19   78 ± 1.6    9.5 ± 1.0     1.99e-13         3.1              23.1             1.086/41  high
acis5935  2005 May 26  11.1  11     2       1.06 ± 0.43                                                                                           low
acis5936  2005 May 28  11.6  11     2       0.82 ± 0.40                                                                                           low
acis5937  2005 Jun 01  12.2  22     6       1.63 ± 0.46                                                                                           low
acis5938  2005 Jun 03  12.0  485    185     37.69 ± 1.87   81 ± 6.2    12.0 ± 5.8    1.89e-13         3.0              25.6             2.041/17  high
acis5939  2005 Jun 06  12.0  364    181     27.53 ± 1.61   76 ± 5.6    8.3 ± 4.2     1.62e-13         2.6              17.7             0.857/13  high
acis5940  2005 Jun 09  12.1  70     20      4.90 ± 0.73                                                                                           low
acis5941  2005 Jun 11  12.0  429    187     32.16 ± 1.74   91 ± 5.8    6.5 ± 4.0     1.74e-13         2.8              10.6             1.004/17  high
acis5942  2005 Jun 15  12.1  405    206     30.30 ± 1.68   70 ± 5.4    14.1 ± 5.1    1.68e-13         2.7              42.5             0.724/14  high
acis5943  2005 Jun 18  12.2  525    187     41.30 ± 1.94   91 ± 5.4    8.8 ± 3.8     2.10e-13         3.3              16.0             1.096/21  high
acis5944  2005 Jun 21  12.0  356    85      28.71 ± 1.64   96 ± 9.8    20.6 ± 8.6    1.16e-13         1.8              20.3             1.128/15  high
acis5945  2005 Jun 24  11.7  415    220     32.04 ± 1.75   65 ± 4.4    19.8 ± 5.3    1.69e-13         2.7              97.4             1.028/14  high
acis5946  2005 Jun 26  12.2  287    167     20.75 ± 1.40   70 ± 7.1    10.6 ± 5.9    1.29e-13         2.0              22.5             1.160/8   high
acis5947  2005 Jun 29  10.8  40     29      2.81 ± 0.62                                                                                           low
acis5948  2005 Jul 03  12.2  77     51      4.93 ± 0.73                                                                                           low
acis5949  2005 Jul 06  12.2  54     33      3.36 ± 0.62                                                                                           low
acis9805  2007 Dec 21  5.2   44     28      7.55 ± 1.43                                                                                           low
acis9122  2008 Feb 01  10.0  55     18      5.40 ± 0.87                                                                                           low
Total high             149.3 7085   3584    37.85 ± 2.53   84 ± 1.3    8.2 ± 0.1     1.98e-13         3.1              16.9             1.739/31
Total low              97.4  384    189     3.61 ± 0.81    82 ± 8.3    5.6 ± 11.3    2.03e-14         0.3              1.4              1.069/12

Column 1 shows the observation identification number. Column 2 shows the observation date. Column 3 shows the on-time (exposure time, ExpT) without dead-time correction. Column 4 shows the
net number of photon counts in the range 0.1 to 8.0 keV (C_net). Column 5 shows counts in the supersoft band (C_soft), 0.1-0.5 keV. Column 6 shows the count rate after vignetting correction. Column 7
shows the temperature for the blackbody fit to the spectrum (kT_bb) in the range 0.3 to 8.0 keV. Column 8 shows the neutral hydrogen column density (n_H). Column 9 shows the 0.3-8.0 keV flux for the
blackbody fit to the spectrum. Column 10 shows the luminosity (L_X) at 0.3-8.0 keV. Column 11 shows the unabsorbed bolometric luminosity (L_bol). Column 12 shows the reduced χ² and degrees of
freedom (dof) for the spectral fit. Column 13 shows whether the observations indicate a high-flux or a low-flux state (10 counts per kilosecond separate these two states).
Ab initio alpha-alpha scattering


Serdar Elhatisari, Dean Lee, Gautam Rupak, Evgeny Epelbaum, Hermann Krebs, Timo A. Lähde, Thomas Luu &
Ulf-G. Meißner

Processes such as the scattering of alpha particles (⁴He), the
triple-alpha reaction, and alpha capture play a major role in 
stellar nucleosynthesis. In particular, alpha capture on carbon 
determines the ratio of carbon to oxygen during helium burning, 
and affects subsequent carbon, neon, oxygen, and silicon burning 
stages. It also substantially affects models of thermonuclear type Ia 
supernovae, owing to carbon detonation in accreting carbon-oxygen 
white-dwarf stars!~>. In these reactions, the accurate calculation 
of the elastic scattering of alpha particles and alpha-like nuclei— 
nuclei with even and equal numbers of protons and neutrons—is 
important for understanding background and resonant scattering 
contributions. First-principles calculations of processes involving 
alpha particles and alpha-like nuclei have so far been impractical, 
owing to the exponential growth of the number of computational 
operations with the number of particles. Here we describe an 
ab initio calculation of alpha-alpha scattering that uses lattice Monte 
Carlo simulations. We use lattice effective field theory to describe 
the low-energy interactions of protons and neutrons, and apply a 
technique called the ‘adiabatic projection method’ to reduce the 
eight-body system to a two-cluster system. We take advantage of 
the computational efficiency and the more favourable scaling with 
system size of auxiliary-field Monte Carlo simulations to compute 
an ab initio effective Hamiltonian for the two clusters. We find 
promising agreement between lattice results and experimental phase 
shifts for s-wave and d-wave scattering. The approximately quadratic 
scaling of computational operations with particle number suggests 
that it should be possible to compute alpha scattering and capture 
on carbon and oxygen in the near future. The methods described 
here can be applied to ultracold atomic few-body systems as well 
as to hadronic systems using lattice quantum chromodynamics to 
describe the interactions of quarks and gluons. 

In recent years there has been much progress in ab initio scattering 
and reactions involving light*° and medium-mass”* nuclei. However, 
for most numerical methods, the number of computational operations 
increases markedly when the projectile nucleus has more than a few 
nucleons. Therefore it remains a challenge to study many important 
processes that are relevant for stellar astrophysics such as alpha—alpha 
scattering, alpha-carbon scattering and radiative capture, as well as car- 
bon and oxygen burning in massive star evolution and thermonuclear 
supernovae’. 

We describe lattice calculations for which the number of computational
(floating point) operations for the A₁-body + A₂-body problem
scales as roughly (A₁ + A₂)²; this scaling is mild enough to make
first-principles calculations of alpha processes possible. We use the 
formalism of lattice effective field theory!” ? (EFT) and a technique 
for elastic scattering and inelastic reactions on the lattice called the 
‘adiabatic projection method’!*""”, 

Chiral EFT is a framework for organizing the low-energy nuclear 
interactions of protons and neutrons according to powers of momenta 
and factors of the mass of the pion; see ref. 18 for a review of the theory. 


The important interactions are at leading order (LO), the next largest 
contributions are at next-to-leading order (NLO), and then follows 
next-to-next-to-leading order (NNLO). We present an ab initio calculation
of ⁴He + ⁴He scattering going up to NNLO terms in chiral EFT. We
find promising agreement with experimental data!? * for the s-wave 
and d-wave phase shifts; improvements can be achieved by including 
higher-order terms in the chiral expansion. 

The adiabatic projection method addresses the cluster-cluster scattering
problem on the lattice by using Euclidean time projection to
construct an effective two-cluster Hamiltonian. By Euclidean time projection
we mean multiplication by exp(−Hτ), where H is the underlying
microscopic Hamiltonian and τ is Euclidean time. We use natural
units, where the reduced Planck constant ħ and the speed of light c
are set to one. Even though the actual lattice calculations use discrete
time steps, we refer to the continuous Euclidean time parameter τ for
notational simplicity.
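
To illustrate why multiplication by exp(−Hτ) is useful, the sketch below applies it to a toy two-level system written in the eigenbasis of H. Everything here is hypothetical (the energies, the starting state and the function name); it only shows that the excited-state admixture decays as exp(−ΔE τ), which is the mechanism that filters out high-energy components, and it is not a lattice calculation.

```python
import math


def project(coefficients, energies, tau):
    """Apply exp(-H*tau) in the eigenbasis of H, then renormalize the state."""
    damped = [c * math.exp(-e * tau) for c, e in zip(coefficients, energies)]
    norm = math.sqrt(sum(x * x for x in damped))
    return [x / norm for x in damped]


# Hypothetical two-level system: energies in arbitrary units, equal initial mix.
energies = [0.0, 1.0]
start = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

for tau in (0.0, 2.0, 8.0):
    state = project(start, energies, tau)
    # Ratio of excited-state to ground-state amplitude after projection.
    print(tau, round(state[1] / state[0], 4))
```

In natural units the product Hτ is dimensionless, so the decay rate of each component is set directly by its energy above the ground state.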

Our starting point is a three-dimensional spatial lattice that is periodic
with length L in each dimension. We take a set of initial two-alpha
states |R⟩, labelled by their separation vector R, as illustrated in Fig. 1.
We take the initial alpha wavefunctions to be Gaussian wave packets,
so that at large separations they factorize as a tensor product of two
individual alpha clusters:

$$|\mathbf{R}\rangle = \sum_{\mathbf{r}} |\mathbf{r}+\mathbf{R}\rangle_1 \otimes |\mathbf{r}\rangle_2,$$

where r is a summation variable corresponding to the location of the
second cluster. The summation over r produces two-alpha states with
total momentum equal to zero. Rather than dealing with a large array
of three-dimensional vectors R, we project onto spherical harmonics
$Y_{\ell,\ell_z}$ with angular momentum quantum numbers ℓ, ℓ_z:

$$|R\rangle^{\ell,\ell_z} = \sum_{\mathbf{R}'} Y_{\ell,\ell_z}(\hat{\mathbf{R}}')\,\delta_{R,|\mathbf{R}'|}\,|\mathbf{R}'\rangle,$$

where δ is the Kronecker delta function. We only consider cases where
R = |R| < L/2.

On the lattice, the symmetry group of spatial rotations is broken down 
to a cubic subgroup. Nevertheless, at low scattering energies, this approx- 
imate rotational symmetry is very accurate, provided that artefacts due 
to the periodic volume are removed. We remove these artefacts using a 
hard spherical wall boundary; the spherical harmonic projection tech- 
nique is useful for extracting data for selected partial waves. This method 
has been extended to particles with spin and partial wave mixing, and 
shows excellent agreement with continuous-space calculations”’. 

We use Euclidean time projection to form dressed cluster states: 


$$|R\rangle_\tau^{\ell,\ell_z} = \exp(-H\tau)\,|R\rangle^{\ell,\ell_z}$$


The evolution in Euclidean time automatically incorporates the
induced deformation and polarization of the alpha clusters as they
approach each other. The deformation and polarization are due to the
interactions of individual nucleons between the two alpha clusters, as
well as to repulsion as a result of the Pauli exclusion principle for identical
fermions.

Figure 1 | Initial state clusters. Initial state |R⟩ composed of two alpha-particle
wave packets on the lattice separated by the displacement vector R.
Each alpha-particle wave packet consists of four nucleons. Protons are red;
neutrons are blue; spins are represented as arrows.

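With these dressed cluster states, we compute matrix elements of the
full microscopic Hamiltonian with respect to the dressed cluster states:

$$[H_\tau]^{\ell,\ell_z}_{R,R'} = {}^{\ell,\ell_z}_{\tau}\langle R|H|R'\rangle^{\ell,\ell_z}_{\tau} \qquad (1)$$

Because the dressed cluster states are not orthogonal, we construct a
norm matrix:

$$[N_\tau]^{\ell,\ell_z}_{R,R'} = {}^{\ell,\ell_z}_{\tau}\langle R|R'\rangle^{\ell,\ell_z}_{\tau}$$

The radial adiabatic Hamiltonian is defined as a matrix product:

$$[H^a_\tau]^{\ell,\ell_z}_{R,R'} = \left[N_\tau^{-1/2}\,H_\tau\,N_\tau^{-1/2}\right]^{\ell,\ell_z}_{R,R'} \qquad (2)$$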

In the limit of large projection time τ, the spectrum of the adiabatic
Hamiltonian reproduces the low-energy finite-volume spectrum of the
microscopic Hamiltonian H. In ref. 17, it is shown that in the asymptotic
region where the alpha clusters are widely separated, the adiabatic
Hamiltonian reduces to a simple two-cluster Hamiltonian with only
infinite-range interactions such as the Coulomb interaction between
the otherwise non-interacting clusters. Although this may seem an
obvious result, it is a non-trivial statement that the dependence on the
projection time τ drops out from the adiabatic Hamiltonian at large
distances.
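
The matrix product that defines the adiabatic Hamiltonian can be illustrated with a toy 2×2 example. The sketch below is not the lattice calculation: the matrices H and N are hypothetical numbers, the function names are invented for the illustration, and the closed-form square root is valid only for symmetric positive-definite 2×2 matrices. It shows that the eigenvalues of N^(−1/2) H N^(−1/2) solve the generalized problem det(H − E N) = 0 for a non-orthogonal basis.

```python
import math


def inv_sqrt_2x2(n):
    """Inverse square root of a symmetric positive-definite 2x2 matrix."""
    (a, b), (_, d) = n
    s = math.sqrt(a * d - b * b)    # sqrt(det N)
    t = math.sqrt(a + d + 2.0 * s)  # sqrt(trace N + 2 sqrt(det N))
    # sqrt(N) = (N + s*I) / t, stored as [[p, q], [q, r]], then inverted.
    p, q, r = (a + s) / t, b / t, (d + s) / t
    det = p * r - q * q
    return [[r / det, -q / det], [-q / det, p / det]]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adiabatic_hamiltonian(h, n):
    """Return N^(-1/2) H N^(-1/2) for 2x2 symmetric h and positive-definite n."""
    w = inv_sqrt_2x2(n)
    return matmul(matmul(w, h), w)


def eigenvalues_sym_2x2(m):
    """Eigenvalues (ascending) of a symmetric 2x2 matrix."""
    mean = 0.5 * (m[0][0] + m[1][1])
    gap = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return mean - gap, mean + gap


# Hypothetical matrix elements in a non-orthogonal two-state basis.
H = [[-1.0, -0.4], [-0.4, 0.5]]
N = [[1.0, 0.3], [0.3, 1.0]]

print(eigenvalues_sym_2x2(adiabatic_hamiltonian(H, N)))
```

Using the symmetric form N^(−1/2) H N^(−1/2), rather than N^(−1) H, keeps the resulting matrix Hermitian, so a standard symmetric eigenvalue routine can be applied to it.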

We study ⁴He + ⁴He scattering using the same lattice action that is
used to study the Hoyle state of ¹²C (ref. 11). The spatial lattice spacing
is a = 1.97 fm and the Euclidean-time, or temporal, lattice spacing is
a_t = 1.32 fm. Revisiting these calculations in the future with different
lattice spacings and including higher-order terms in the chiral expansion
will provide a useful measure of systematic errors in lattice calculations
of larger nuclear systems.

We perform projection Monte Carlo simulations with auxiliary fields
to compute the matrices $[H_\tau]^{\ell,\ell_z}_{R,R'}$ and $[N_\tau]^{\ell,\ell_z}_{R,R'}$ on a periodic cubic lattice
with volume L³ = (16 fm)³; see ref. 24 for an overview of methods used
in lattice EFT. The total projection time for the initial and final dressed
cluster states together is 2τ, which is equal to the product of the number
of time steps L_t and the temporal lattice spacing a_t. We determine
$[N_\tau]^{\ell,\ell_z}_{R,R'}$ from calculations with L_t time steps and $[H_\tau]^{\ell,\ell_z}_{R,R'}$ from calculations
with L_t + 1 time steps. The extra time step for $[H_\tau]^{\ell,\ell_z}_{R,R'}$ is needed
to calculate the matrix elements of H in equation (1). For these calculations,
a new algorithm is used to allow for Monte Carlo updates of the
auxiliary fields as well as updates of the alpha cluster positions.

We compute the radial adiabatic Hamiltonian using equation (2) and
extend it to a much larger volume of (120 fm)³. This is done by
computing matrix elements of $[H^a_\tau]^{\ell,\ell_z}_{R,R'}$ at large separation (large R
and R′) from single-alpha lattice simulations, and then including the
Coulomb interaction between the otherwise non-interacting clusters.
This process also allows us to define a ‘trivial’ two-cluster Hamiltonian
in which the two alpha clusters are non-interacting except for the
infinite-range Coulomb interaction.

With the radial adiabatic Hamiltonian defined in the large (120 fm)³
box, we extract the scattering phase shifts by imposing a hard spherical
wall boundary at some radius R_wall and determining the standing wave
modes. In Fig. 2 we show s-wave radial functions for two different
radial excitations (2s and 3s) at NNLO using chiral EFT. The error bars
show 1-standard deviation (s.d.) Monte Carlo errors calculated using
a jackknife analysis of the lattice data. We could extract the phase shift
by fitting to the asymptotic behaviour of the radial wavefunction as in
ref. 17; however, it is more accurate to extract the phase shifts from the
energy of the standing wave, as discussed in ref. 25.

Figure 3 shows the phase shifts for s-wave scattering versus labo- 
ratory energy at LO, NLO, and NNLO in chiral EFT, compared with 
experimental data!®-””. The green dashed (LO), blue short-dashed 
(NLO), and red solid lines (NNLO) are determined from fits to the 
lattice data using the effective range expansion (see Methods). For 
further comparison, the inset of Fig. 3 shows NLO results using 
halo EFT with point-like alpha particles*®. Halo EFT is an effec- 
tive theory in which clusters of tightly bound nucleons are treated 
as point particles. Our LO results do not include Coulomb effects 
and so have substantially different behaviour near the alpha—alpha 
scattering threshold. The NLO and NNLO phase shifts are quite 
similar, and both agree fairly well with the experimental data. The 
close agreement between NLO and NNLO results is probably acci- 
dental: several contributions appearing at NNLO seem to cancel 
each other out. The same does not occur for the d-wave phase shifts. 
The results and error bars shown in Fig. 3 are obtained by extrapolating
lattice phase-shift data for $L_t = 4$ to $L_t = 10$ to the
limit $L_t \to \infty$. Details of the extrapolation fit and all associated
error estimates are discussed in Methods. The observed energy of
the s-wave resonance in the centre-of-mass frame is 0.09184 MeV 
above threshold. For the lattice results, we find that the ground 
state is 0.79(9) MeV below threshold at LO, and 0.11(1) MeV below 
threshold at both NLO and NNLO (the errors in parentheses here 
and elsewhere represent 1 s.d.). 


[Figure 2 plot: legend entries '2s state' and '3s state'; horizontal axis r (fm).]


Figure 2 | s-wave scattering radial wavefunctions. The second-lowest- 
energy (red squares) and third-lowest-energy (blue circles) s-wave radial 
wavefunctions for spherical wall radius $R_{\mathrm{wall}} \approx 36$ fm (grey dashed line) at
NNLO plotted versus radial distance. The dashed and double-dot-dashed 
lines show the fits to a Coulomb wavefunction for the second and third 
radial states, respectively. The error bars indicate 1-s.d. Monte Carlo errors 
calculated using a jackknife analysis of the lattice data. 
Figure 3 | s-wave phase shifts. s-wave phase shifts $\delta_0$ at LO (green
triangles), NLO (blue circles), and NNLO (red squares) versus laboratory
energy $E_{\mathrm{lab}}$, compared with experimental data (refs 19-22; black asterisks). The
theoretical error bars indicate 1 s.d. uncertainty due to Monte Carlo errors 
and the extrapolation of that data to infinite projection time. The green 
dashed (LO), blue short-dashed (NLO), and red solid (NNLO) lines are 
determined from fits to the lattice data using the effective range expansion. 
The black dot-dashed line in the inset shows NLO results using halo EFT 
with point-like alpha particles°. 


In Fig. 4 we show phase shifts for d-wave scattering versus laboratory
energy at LO, NLO and NNLO, compared with experimental data (refs 19-22).
The green dashed (LO), blue short-dashed (NLO), and red solid lines
(NNLO) are determined from fits to the lattice data using the effective
range expansion. Although there are differences, the NNLO results
agree fairly well with the experimental results. As in the s-wave case, we
show the extrapolated values and errors in the limit $L_t \to \infty$, using
lattice data for $L_t = 4$ to $L_t = 10$. Details of the extrapolation fit and all
associated error estimates are discussed in Methods. We determined
the centre-of-mass energy and the decay width of the d-wave resonance
from the phase shift data of ref. 22 to be $E_R = 2.92(18)$ MeV and
$\Gamma = 1.34(50)$ MeV, respectively. Owing to the large decay width, there
is some model dependence in the definitions of the resonance parameters;
we discuss several different definitions and determinations in
Methods. At LO we find $E_R = 1.10(12)$ MeV and $\Gamma = 0.32(10)$ MeV,
at NLO $E_R = 3.84(16)$ MeV and $\Gamma = 3.22(21)$ MeV, and at NNLO
$E_R = 3.27(12)$ MeV and $\Gamma = 2.09(16)$ MeV.

To summarize, we present an ab initio calculation of 4He + 4He scat- 
tering. We use lattice EFT and the adiabatic projection method to com- 
pute phase shifts for s-wave and d-wave scattering up to NNLO, and 
find promising agreement with experimental data. To perform these 
calculations, we used spherical wave projections of the lattice initial 
states and a new algorithm that performs updates of both the auxiliary 
field configurations and alpha cluster positions. A schematic of the 
method is given in Extended Data Fig. 1. 

Perhaps the most notable outcome of this study is a numerical 
method for simulating scattering and reactions that has a very favourable
scaling with particle number. The number of computational operations
needed for the $A_1$-body + $A_2$-body problem scales roughly as
$(A_1 + A_2)^2$ for light and medium-mass nuclei, and the algorithm does
not require the projectile to be very light. Because sign oscillations 
are greatly suppressed for alpha-like nuclei!””’, our approach appears 
to be a viable method for studying important processes such as alpha 
scattering and capture on 12C. Direct experimental measurement of alpha
capture on 12C is not possible at energies relevant for stellar
nucleosynthesis, owing to Coulomb barrier suppression, and extrapolations from
higher energies have uncertainties that exceed the 10% accuracy needed
for stellar evolution models.

Nevertheless, there has been progress in measuring the contribu- 
tion from subthreshold states”*® and cumulative R-matrix analyses 
using multiple data sources such as beta-delayed alpha-decay of 16N
and 4He + 12C elastic scattering. Ab initio lattice calculations can
contribute to these efforts by calculating asymptotic normalization
coefficients for subthreshold states, determining the direct capture rate onto
the ground state, and providing low-energy data on 4He + 12C elastic
scattering. For these future calculations, we expect that about four times 
as much computing time as the roughly two million core hours used 
for this work will be required; the computational resources available 
appear sufficient to keep stochastic errors under control. To reduce 
systematic errors, we are currently working on including lattice nuclear 
forces at the next-higher order in the chiral expansion, reducing the 
lattice spacing, improving the lattice action, and doing precision tests 
of systematic errors in the adiabatic projection method. If necessary, 
the ab initio lattice results will be further improved by including short- 
range operators in the adiabatic Hamiltonian to make fine adjustments 
to the energies of near-threshold states of 16O.

There is an obvious overlap between lattice calculations using the 
adiabatic projection method and halo EFT. Therefore it might be 
fruitful to look for synergies between the two methods. In cases where 
there is a large scale separation between the low-energy scattering and 
high-energy internal excitations, benchmark tests can be made between 
halo EFT and lattice calculations. Furthermore, ab initio calculations 
can be used to determine input data for halo EFT, as done in ref. 30. 
In cases where the separation of scales is not large, lattice calculations 
can be used to guide improvement of halo EFT to include nuclear 
core excitations. It also might be useful to treat the lattice adiabatic 
Hamiltonian as a halo EFT for clusters, and explore extensions to three- 
and four-cluster systems. This method could potentially be used to 
investigate multi-alpha-cluster structures in 12C and 16O.

It would be exciting to extend the methods presented here to lattice
quantum chromodynamics (QCD) and construct adiabatic
Hamiltonians for hadronic systems. All of the techniques used in our
lattice simulations have immediate analogues in lattice QCD. The initial
cluster wavefunctions and stochastic sampling of cluster positions can
be implemented using interpolating sources in lattice QCD. The lattice
spherical harmonic projections introduced here and extended in
ref. 23 should be useful for the treatment of angular momentum and
partial waves in lattice QCD and for constructing a hadronic adiabatic
Hamiltonian.

Figure 4 | d-wave phase shifts. d-wave phase shifts $\delta_2$ at LO (green
triangles), NLO (blue circles), and NNLO (red squares) versus laboratory
energy $E_{\mathrm{lab}}$, compared with experimental data (refs 19-22; black asterisks). The
theoretical error bars indicate 1 s.d. uncertainty due to Monte Carlo errors
and the extrapolation of that data to infinite projection time. The green
dashed (LO), blue short-dashed (NLO), and red solid (NNLO) lines are
determined from fits to the lattice data using the effective range expansion.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Auxiliary field formalism. We simulate the interactions of nucleons on the lattice 
using projection Monte Carlo with auxiliary fields; see ref. 24 for an overview of 
methods used in lattice EFT. The details of the lattice action used in the calculations 
can be found in refs 11 and 31. We use an auxiliary-field formalism where the 
interactions among nucleons are replaced by interactions of nucleons with
auxiliary fields at every lattice point in space and time (refs 32-34). This follows from an exact
Gaussian integral identity connecting the exponential of the two-particle density
$\rho^2$ to the integrated exponential of the one-particle density $\rho$:

$$\exp\left(-\frac{C}{2}\rho^{2}\right) = \sqrt{\frac{1}{2\pi}}\int_{-\infty}^{\infty} ds\, \exp\left(-\frac{1}{2}s^{2} + \sqrt{-C}\,s\rho\right) \qquad (3)$$

where $C$ is the interaction coefficient and $s$ is an auxiliary variable. In the auxiliary-
field formalism each nucleon evolves as if it is a single particle in a fluctuating 
background of auxiliary fields. We use a total of sixteen auxiliary fields at LO in 
the chiral expansion coupled to the total nucleon density, three spin densities, 
three isospin densities, and nine spin-isospin densities. Similarly, the pion fields 
function much like auxiliary fields and generate the one-pion exchange potential. 
The interactions are reproduced by integrating over the auxiliary fields and pion 
fields. We use a spatial lattice spacing $a = (100\ \mathrm{MeV})^{-1} = 1.97$ fm and temporal
lattice spacing $a_t = (150\ \mathrm{MeV})^{-1} = 1.32$ fm.
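
The Gaussian integral identity in equation (3) can be checked numerically. The following is a minimal sketch, assuming an attractive coefficient $C < 0$ so that $\sqrt{-C}$ is real, and treating $\rho$ as an ordinary number instead of a density operator. The function names, the integration cutoff and the trapezoid rule are choices made here for illustration only.

```python
import math


def hs_lhs(c, rho):
    """Left-hand side of equation (3): exp(-C rho^2 / 2)."""
    return math.exp(-0.5 * c * rho**2)


def hs_rhs(c, rho, s_max=12.0, n_points=20001):
    """Right-hand side of equation (3) by the trapezoid rule on [-s_max, s_max].

    Requires c <= 0 so that sqrt(-c) is real (assumption of this sketch).
    """
    coupling = math.sqrt(-c)
    h = 2.0 * s_max / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        s = -s_max + k * h
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * math.exp(-0.5 * s * s + coupling * s * rho)
    return total * h / math.sqrt(2.0 * math.pi)
```

For example, with $C = -0.5$ and $\rho = 2$ both sides evaluate to $e$ within the quadrature accuracy. The integrand depends on $\rho$ only linearly in the exponent, which is why each nucleon can be evolved as a single particle in the background field $s$.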

For any fixed initial and final state, the amplitude for a given configuration of
pion and auxiliary fields is proportional to the determinant of an $A \times A$ matrix $M_{ij}$.
The entries of $M_{ij}$ are the single-nucleon amplitudes for a nucleon starting at state
$j$ at $\tau = 0$ and ending at state $i$ at $\tau = \tau_f$. Formally, the calculation proceeds as
follows. Let $|\Psi_{R_1,R_2}\rangle$ be an antisymmetrized product of single-nucleon states
comprising two alpha clusters centred as Gaussian wave packets at locations $R_1$ and
$R_2$. Let $H_{\mathrm{LO}}$ denote the LO Hamiltonian that includes instantaneous one-pion
exchange and contact interactions. Let $H_{\mathrm{SU(4)}}$ be an approximation to $H_{\mathrm{LO}}$ that has
an underlying SU(4) symmetry among the nucleons that eliminates sign oscilla- 
tions’”*>. This SU(4) symmetry, discussed in ref. 36, refers to a symmetry group 
where the four nucleon states (proton spin-up, proton spin-down, neutron spin-up, 
neutron spin-down) can be interchanged with each other. This is a good starting 
point because the low-energy nuclear interactions are approximately SU(4) sym- 
metric. The Coulomb interaction and the one-pion exchange interaction are cer- 
tainly not SU(4) symmetric; however, the important short- and medium-range 
parts of the s-wave nucleon-nucleon interactions appear to respect this symmetry 
rather well*”. There are arguments from quantum chromodynamics in the limit of 
a large number of colours that explain how this spin and flavour symmetry can 
arise**°. There is also empirical evidence that some predictions of SU(4) symme- 
try are rather well satisfied by the spectrum of light nuclei*, and lattice QCD 
simulations have shown that SU(4) symmetry becomes even more accurate for 
heavier quark masses“). 

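Let us define a trial wavefunction:

$$|\Psi_{R_1,R_2}(\tau')\rangle = \exp(-H_{\mathrm{SU(4)}}\tau')\,|\Psi_{R_1,R_2}\rangle$$

We are using $\exp(-H_{\mathrm{SU(4)}}\tau')$ as an approximate low-energy filter that is
computationally inexpensive. With this trial wavefunction we compute the amplitude:

$$Z(R_3,R_4;R_1,R_2;2\tau) = \langle\Psi_{R_3,R_4}(\tau')|\exp(-2H_{\mathrm{LO}}\tau)|\Psi_{R_1,R_2}(\tau')\rangle$$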


We write 27 rather than 7 because the total projection time for the initial and 
final dressed cluster states together is 27. Higher-order contributions, Coulomb 
interactions, and isospin-breaking effects are computed as perturbative corrections 
to this amplitude. The auxiliary fields are updated using a non-local updating 
algorithm called hybrid Monte Carlo, while the coordinates of the alpha clusters 
in the initial and final states are updated using the Metropolis algorithm with 
random local updates. 

Hybrid Monte Carlo algorithm. The hybrid Monte Carlo algorithm’? is an 
efficient method for generating non-local updates of the auxiliary fields and 
pion fields. For simplicity we denote s as one of the sixteen possible auxiliary 
fields or three pion fields and discuss the updating algorithm for this field. The 
hybrid Monte Carlo algorithm was first introduced in lattice QCD, in which 
the size of the matrix is proportional to the number of space-time lattice points. 
In our case the matrix $M_{ij}$ is an $A \times A$ matrix, where $A$ is the number of nucleons.
In general terms, the algorithm is described by means of a probability weight and a
molecular-dynamics Hamiltonian:

$$P(s) \propto \exp[-V(s)], \qquad H(s,p) \equiv \frac{1}{2}\sum_{n,n_t}[p(n,n_t)]^{2} + V(s)$$

$V(s)$ is, in general, a non-local function of the field $s(n,n_t)$, where $n$ denotes the
spatial lattice site and $n_t$ the time step; $p(n,n_t)$ denotes the momentum
conjugate to $s(n,n_t)$. In our case $P(s)$ is a product of terms arising from the
quadratic action for $s$ (given in equation (3)) and the absolute value of the determinant
of $M_{ij}(s)$.

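Given an arbitrary initial configuration $s^0(n,n_t)$, the conjugate momentum is
chosen from a random Gaussian distribution:

$$P[p^{0}(n,n_t)] \propto \exp\left\{-\frac{1}{2}[p^{0}(n,n_t)]^{2}\right\} \qquad (4)$$

The Hamiltonian equations of motion are then integrated numerically with a finite
step size $\varepsilon_{\mathrm{step}}$. We begin with a half-step forward in the conjugate momentum:

$$\tilde{p}^{0}(n,n_t) = p^{0}(n,n_t) - \frac{\varepsilon_{\mathrm{step}}}{2}\left[\frac{\partial V(s)}{\partial s(n,n_t)}\right]_{s=s^{0}}$$

followed by repeated updates of $s$ and $\tilde{p}$:

$$s^{i+1}(n,n_t) = s^{i}(n,n_t) + \varepsilon_{\mathrm{step}}\,\tilde{p}^{i}(n,n_t), \qquad \tilde{p}^{i+1}(n,n_t) = \tilde{p}^{i}(n,n_t) - \varepsilon_{\mathrm{step}}\left[\frac{\partial V(s)}{\partial s(n,n_t)}\right]_{s=s^{i+1}}$$

for a specified number of steps $N_{\mathrm{step}}$. The remaining half-step backward in $p$ is
given by:

$$p^{N_{\mathrm{step}}}(n,n_t) = \tilde{p}^{N_{\mathrm{step}}}(n,n_t) + \frac{\varepsilon_{\mathrm{step}}}{2}\left[\frac{\partial V(s)}{\partial s(n,n_t)}\right]_{s=s^{N_{\mathrm{step}}}}$$

The evolved configuration is then subjected to a ‘Metropolis test’ using a random
number $r \in [0, 1)$. The condition:

$$r < \exp\left[-H(s^{N_{\mathrm{step}}}, p^{N_{\mathrm{step}}}) + H(s^{0}, p^{0})\right]$$

determines whether the new configuration is accepted or rejected. If the Metropolis
test is passed, then both s and p are updated, otherwise the original s is retained
and only p is refreshed, according to equation (4).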
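
The update sequence described above can be sketched in a few lines. The example below is a toy illustration under stated assumptions: the action is taken to be $V(s) = \sum_k s_k^2/2$ for a short list of field values, whereas in the calculations reported here $V(s)$ also contains the determinant contribution. The function names and the parameter values in the usage note are hypothetical.

```python
import math
import random


def potential(s):
    """Toy action V(s) = sum_k s_k^2 / 2 (assumption of this sketch)."""
    return 0.5 * sum(x * x for x in s)


def grad_potential(s):
    """Gradient dV/ds_k of the toy action."""
    return list(s)


def hamiltonian(s, p):
    """Molecular-dynamics Hamiltonian H(s, p) = sum_k p_k^2 / 2 + V(s)."""
    return 0.5 * sum(x * x for x in p) + potential(s)


def hmc_update(s0, eps_step, n_step, rng):
    """One hybrid Monte Carlo update; returns (configuration, accepted)."""
    # Draw the conjugate momentum from the Gaussian distribution of equation (4).
    p0 = [rng.gauss(0.0, 1.0) for _ in s0]
    # Half-step forward in the conjugate momentum.
    grad = grad_potential(s0)
    p_tilde = [p - 0.5 * eps_step * g for p, g in zip(p0, grad)]
    # Repeated full-step updates of s and p_tilde.
    s = list(s0)
    for _ in range(n_step):
        s = [x + eps_step * p for x, p in zip(s, p_tilde)]
        grad = grad_potential(s)
        p_tilde = [p - eps_step * g for p, g in zip(p_tilde, grad)]
    # Remaining half-step backward in p, using the gradient at the final s.
    p_final = [p + 0.5 * eps_step * g for p, g in zip(p_tilde, grad)]
    # Metropolis test: accept if r < exp[-H(new) + H(old)].
    delta_h = hamiltonian(s, p_final) - hamiltonian(s0, p0)
    if rng.random() < math.exp(min(0.0, -delta_h)):
        return s, True
    return list(s0), False
```

Calling `hmc_update(s, 0.2, 10, random.Random(0))` repeatedly on a four-component configuration should give a high acceptance rate for this toy action, because the leapfrog integration nearly conserves $H$. The momentum is redrawn on every call, which is how the refresh after a rejected update is realized in this sketch.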

Calculation of adiabatic projection matrix elements. Along with each hybrid
Monte Carlo update, we allow for Metropolis updates of the locations of the Gaussian
wave packets in the initial state, $R_1$, $R_2$, and the final state, $R_3$, $R_4$. This Metropolis
update is accepted or rejected on the basis of the absolute value of the amplitude
$Z(R_3,R_4;R_1,R_2;2\tau)$ for the new wave packet positions. We perform projection
Monte Carlo simulations to compute the matrices $[H_\tau]^{\ell,\ell_z}_{R,R'}$ and
$[N_\tau]^{\ell,\ell_z}_{R,R'}$ on a periodic cubic lattice with volume
$L^3 = (16\ \mathrm{fm})^3$ for partial waves $\ell = 0$ and $\ell = 2$. Simulations are
performed for both the s-wave and the d-wave. From the matrices
$[H_\tau]^{\ell,\ell_z}_{R,R'}$ and $[N_\tau]^{\ell,\ell_z}_{R,R'}$, we compute the radial
adiabatic Hamiltonian $[H^a_\tau]^{\ell,\ell_z}_{R,R'}$.

We use values of the radial parameters $R$ and $R'$ ranging up to $L/3$, but we
extend the radial Hamiltonian to much larger values of $R$ and $R'$ by computing a
‘trivial’ radial adiabatic Hamiltonian from single-alpha-cluster simulations and
including the infinite-range Coulomb interaction explicitly. Let $|\Psi_{R_1}\rangle$ be an
antisymmetrized product of single-nucleon states comprising a single alpha cluster.
As is done for the two-cluster simulations, we define a trial wavefunction:

$$|\Psi_{R_1}(\tau')\rangle = \exp(-H_{\mathrm{SU(4)}}\tau')\,|\Psi_{R_1}\rangle$$

With this trial wavefunction, we compute the LO one-cluster amplitude:

$$Z(R_3;R_1;2\tau) = \langle\Psi_{R_3}(\tau')|\exp(-2H_{\mathrm{LO}}\tau)|\Psi_{R_1}(\tau')\rangle$$

In the limit where $R_1$ and $R_3$ are widely separated from $R_2$ and $R_4$, we can factorize
the LO two-cluster amplitude as a product of one-cluster amplitudes:

$$Z(R_3,R_4;R_1,R_2;2\tau) \to Z(R_3;R_1;2\tau)\,Z(R_4;R_2;2\tau)$$


In this manner we determine the trivial radial adiabatic Hamiltonian for large
separation between the clusters. We use the trivial radial adiabatic Hamiltonian to
extend $[H^a_\tau]^{\ell,\ell_z}_{R,R'}$ to volumes as large as $L^3 = (120\ \mathrm{fm})^3$. This trivial radial adiabatic
Hamiltonian does not take into account effects such as the antisymmetrization of
identical nucleons from different clusters; however, such effects are expected to be
negligible because we use the trivial Hamiltonian only at large distances where the
clusters do not overlap. We performed simulations at a somewhat larger volume,
$L^3 = (20\ \mathrm{fm})^3$, and estimate that the lack of full antisymmetrization of
the trivial Hamiltonian results in errors smaller than a couple of per cent for the phase
shifts.

The NLO and NNLO corrections are included using first-order perturbation
theory for the matrix $[H_\tau]^{\ell,\ell_z}_{R,R'}$, which yields a corresponding correction to the radial
adiabatic Hamiltonian, $[H^a_\tau]^{\ell,\ell_z}_{R,R'}$. Even though we treat higher-order corrections to the
radial adiabatic Hamiltonian using perturbation theory, the scattering states of the
radial adiabatic Hamiltonian are determined by imposing a hard spherical wall at
radius $R_{\mathrm{wall}}$ and using sparse eigenvector methods to determine the standing
waves”, Therefore, at NLO and higher, we recover important non-perturbative 
Coulomb interactions near threshold. More detailed discussions of the infrared 
enhancement of Coulomb effects and strategies for power counting in EFT are 
found in refs 46 and 47. 

The Coulomb interaction is included using perturbation theory in the lattice
Monte Carlo simulations because a non-perturbative treatment to all orders
would produce sign oscillations that diminish the quality of the data. For these
calculations, however, there is no need to treat the Coulomb interaction
non-perturbatively in the Monte Carlo simulations. The correction to the alpha
binding energy due to the Coulomb interaction is less than 10%, and so
higher-order Coulomb effects are at the level of 1% or smaller. The Coulomb interac-
tion becomes important only at long distances, owing to the different asymptotic 
properties of Coulomb wavefunctions and spherical Bessel functions. This differ- 
ence between Coulomb wavefunctions and spherical Bessel functions is properly 
handled in our calculations by the non-perturbative treatment of the Coulomb 
interaction in the adiabatic Hamiltonian. 

For alpha-like nuclei, with even and equal numbers of protons and neutrons, 

sign oscillations are suppressed, owing to the approximate SU(4) symmetry of 
the low-energy interactions!*””*>°, For very light nuclei, the computing time 
is dominated by the calculation of the single nucleon amplitudes, which scales
linearly with the number of nucleons $A$. For very heavy nuclei, the computational
time is mostly consumed by calculations of matrix determinants, which scale as
$A^3$. For light to medium-mass nuclei, the scaling of the number of computational
operations is roughly $A^2$, which is between $A$ and $A^3$. The simulations were run
with up to 32,768 cores on the JUQUEEN Blue Gene/Q supercomputer at the
Jülich Supercomputing Centre. Roughly two million core hours were needed for
the calculations reported here.
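
As a small worked example of the rough $A^2$ scaling, the sketch below compares the cost of a 16-nucleon system (4He + 12C) with the 8-nucleon system (4He + 4He) studied here. The function name is hypothetical, and the quadratic exponent is the rough estimate quoted above for light to medium-mass nuclei, not an exact law.

```python
def relative_cost(a_total, a_reference, exponent=2.0):
    """Ratio of computational operations under an assumed A**exponent scaling."""
    return (a_total / a_reference) ** exponent


# 4He + 12C (16 nucleons) relative to 4He + 4He (8 nucleons).
ratio = relative_cost(16, 8)
# Scaled from the roughly two million core hours quoted for this work.
core_hours_estimate = 2.0e6 * ratio
```

Under this assumption the ratio is 4, in line with the estimate in the main text that about four times as much computing time would be needed.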
Effective range expansion. The Coulomb interaction does not appear in the LO
term in the chiral expansion because we consider the electric charge $e$ as a small
parameter. There are alternative schemes that take into account enhancement of
the Coulomb interaction near threshold (refs 46 and 47). We use the effective range expansion
to draw the green dashed lines through the LO data points in Figs 3 and 4. For
partial wave $\ell$, we fit the generalized scattering length $a_\ell$, effective range $r_\ell$, and
shape parameter $P_\ell$ using:

$$p^{2\ell+1}\cot[\delta_\ell(p)] = -\frac{1}{a_\ell} + \frac{1}{2}r_\ell p^{2} + P_\ell p^{4} + \cdots \qquad (5)$$

where $p$ is the relative momentum.
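
Equation (5) can be inverted for the phase shift once the parameters are known. The sketch below is illustrative only: the function names are hypothetical, the expansion is truncated after the $p^4$ term, and the phase shift is returned on the branch $(0, \pi)$.

```python
import math


def ere_rhs(p, a, r, shape):
    """Right-hand side of equation (5), truncated after the p^4 term."""
    return -1.0 / a + 0.5 * r * p**2 + shape * p**4


def ere_phase_shift(p, ell, a, r=0.0, shape=0.0):
    """Phase shift delta_ell(p) in radians, in (0, pi), for p > 0.

    Solves p^(2 ell + 1) cot(delta) = ere_rhs(p, a, r, shape).
    """
    return math.atan2(p ** (2 * ell + 1), ere_rhs(p, a, r, shape))
```

In a fit, $a_\ell$, $r_\ell$ and $P_\ell$ would be adjusted so that the left-hand side of equation (5), evaluated from the lattice phase shifts, matches this polynomial in $p^2$.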

For the NLO and NNLO results, we use the Coulomb-modified effective range
expansion to draw the blue short-dashed (NLO) and red solid (NNLO) lines in
Figs 3 and 4. We briefly outline the elements of the Coulomb-modified effective
range expansion here. Let us define the Coulomb parameter:

$$\gamma = 2\mu\,\alpha_{\mathrm{EM}} Z_1 Z_2$$

where $\mu$ equals the reduced mass of the two-alpha system, $\alpha_{\mathrm{EM}} \approx 1/137$ is the
electromagnetic fine-structure constant, and $Z_1 = Z_2 = 2$ are the charges of the two
alpha particles. The factor $C_{\eta,\ell}$ for partial wave $\ell$ is defined as:

$$C^{2}_{\eta,\ell} = \frac{2^{2\ell}}{[(2\ell+1)!]^{2}}\,C^{2}_{\eta,0}\prod_{s=1}^{\ell}(s^{2}+\eta^{2})$$

where

$$C^{2}_{\eta,0} = \frac{2\pi\eta}{e^{2\pi\eta}-1}$$

and $\eta = \gamma/(2p)$. The Coulomb-modified effective range expansion (refs 48-50) is then:

$$C^{2}_{\eta,\ell}\,p^{2\ell+1}\cot[\delta_\ell(p)] + \gamma h_\ell(p) = -\frac{1}{a_\ell} + \frac{1}{2}r_\ell p^{2} + P_\ell p^{4} + \cdots \qquad (6)$$

where:

$$h_\ell(p) = p^{2\ell}\,\frac{C^{2}_{\eta,\ell}}{C^{2}_{\eta,0}}\,h(\eta), \qquad h(\eta) = \mathrm{Re}[\psi(i\eta)] - \log\eta$$

and $\psi(z) = \Gamma'(z)/\Gamma(z)$, in which the prime indicates differentiation.
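
The Coulomb factors entering the Coulomb-modified effective range expansion are straightforward to evaluate. The sketch below is a hedged illustration: the function names, the alpha-particle mass and the value of the fine-structure constant are assumptions made here, and $\mathrm{Re}[\psi(i\eta)]$ is computed from the standard series $-\gamma_E + \eta^2\sum_{n\ge 1} 1/[n(n^2+\eta^2)]$, which is a choice of this sketch and not necessarily how the fits in the paper were done.

```python
import math

HBARC = 197.327            # MeV fm
ALPHA_EM = 1.0 / 137.036   # fine-structure constant (assumed value)
M_ALPHA = 3727.38          # alpha-particle mass in MeV (assumed value)
EULER_GAMMA = 0.5772156649015329


def coulomb_gamma(mu, z1, z2):
    """Coulomb parameter gamma = 2 mu alpha_EM Z1 Z2, converted to fm^-1 (mu in MeV)."""
    return 2.0 * mu * ALPHA_EM * z1 * z2 / HBARC


def sommerfeld_eta(gamma, p):
    """eta = gamma / (2 p), with gamma and p both in fm^-1."""
    return gamma / (2.0 * p)


def c2_eta0(eta):
    """C^2_{eta,0} = 2 pi eta / (exp(2 pi eta) - 1), for eta > 0."""
    x = 2.0 * math.pi * eta
    if x > 700.0:  # exp(x) would overflow; the factor is negligibly small here
        return 0.0
    return x / math.expm1(x)


def c2_eta_ell(eta, ell):
    """C^2_{eta,ell} built from C^2_{eta,0} and the product over s = 1..ell."""
    prod = 1.0
    for s in range(1, ell + 1):
        prod *= s * s + eta * eta
    return 2.0 ** (2 * ell) / math.factorial(2 * ell + 1) ** 2 * prod * c2_eta0(eta)


def h_eta(eta, n_terms=100000):
    """h(eta) = Re[psi(i eta)] - log(eta), using a truncated series for Re[psi]."""
    total = sum(1.0 / (n * (n * n + eta * eta)) for n in range(1, n_terms + 1))
    total += 1.0 / (2.0 * n_terms**2)  # integral estimate of the neglected tail
    return -EULER_GAMMA + eta * eta * total - math.log(eta)
```

For the two-alpha system, `coulomb_gamma(M_ALPHA / 2, 2, 2)` gives the Coulomb parameter in fm$^{-1}$ under the assumed constants, and $\eta$ then grows as $1/p$ towards threshold, where $C^{2}_{\eta,\ell}$ suppresses the amplitude.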

We used the effective range expansion (equation (5)) and Coulomb-modified 
effective range expansion (equation (6)) to extract s-wave and d-wave energy 
levels. Although there is no problem extracting energy levels from the lattice 
data, the experimentally determined alpha-alpha s-wave phase shifts are not of 


sufficient quality near threshold to accurately determine the s-wave resonance 
parameters, and so they must be determined from other reactions. The Triangle 
Universities Nuclear Laboratory (TUNL) nuclear data evaluation gives the values”! 
$E_R = 0.09184$ MeV and $\Gamma = 5.57(25)$ eV.

The d-wave resonance extraction must be treated with some caution because 
the large decay width results in considerable model dependence, owing to different 
definitions of the resonance parameters and different fits to experimental data*™°?. 
Here we use the definition that the resonance energy $E_R$ is given by the location
of the maximum of $d\delta/dE$, and the decay width $\Gamma$ is determined by the value of
$2(d\delta/dE)^{-1}$ at $E_R$ (ref. 54). Using this definition, the d-wave phase shifts given in
ref. 22 yield $E_R = 2.92(18)$ MeV and $\Gamma = 1.34(50)$ MeV. Using the set of phase shifts
provided in ref. 55, we obtain $E_R = 2.88(6)$ MeV and $\Gamma = 1.52(3)$ MeV. The TUNL
nuclear data evaluation reports values of $E_R = 3.12(1)$ MeV and $\Gamma = 1.513(15)$ MeV.
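
The resonance definition used here can be applied directly to tabulated phase shifts. The sketch below is a minimal illustration, assuming phase shifts in radians on an increasing energy grid and using central finite differences for $d\delta/dE$. The function name and the Breit-Wigner test shape mentioned afterwards are assumptions of this example, not part of the analysis in the paper.

```python
import math


def resonance_from_phase_shifts(energies, deltas):
    """Return (E_R, Gamma): E_R maximizes d(delta)/dE and Gamma = 2 / (d(delta)/dE).

    energies: increasing list of energies; deltas: phase shifts in radians.
    The derivative is estimated with central differences at interior points.
    """
    best_index, best_slope = None, -math.inf
    for i in range(1, len(energies) - 1):
        slope = (deltas[i + 1] - deltas[i - 1]) / (energies[i + 1] - energies[i - 1])
        if slope > best_slope:
            best_index, best_slope = i, slope
    return energies[best_index], 2.0 / best_slope
```

For a pure Breit-Wigner phase shift, $\delta(E) = \arctan[(\Gamma/2)/(E_R - E)]$, the slope $d\delta/dE$ peaks at $E_R$ with value $2/\Gamma$, so the function returns the input parameters up to the grid resolution. For broad resonances on a smooth background, other definitions can give noticeably different values, which is the model dependence noted above.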
Nuclear forces at NNLO. In the chiral nuclear EFT used here, the interactions
among nucleons are organized according to their importance on the basis of a
systematic expansion in powers of $Q/\Lambda$, with the ‘hard scale’ of the nuclear
interactions $\Lambda \approx 1$ GeV. The ‘soft scale’ $Q$ is associated with nucleon spatial momenta
and the pion mass $M_\pi$. The dominant contributions to the nuclear Hamiltonian
appear at $O((Q/\Lambda)^0)$ (LO); the NLO terms are $O((Q/\Lambda)^2)$ and involve only the
two-nucleon force. In the results presented here, contributions to the nuclear
Hamiltonian are taken into account up to $O((Q/\Lambda)^3)$ (NNLO). These terms
include the three-nucleon force, which first appears at NNLO. The electromagnetic
force, which is important in nuclear binding, is also included consistently and 
systematically (for details, see ref. 31). At LO and NLO, we have two and seven 
low-energy constants, respectively. These constants are determined from a fit to 
neutron-proton scattering data. In addition, two low-energy constants parame- 
terize the breaking of isospin symmetry of the strong nuclear force and are fixed 
from the proton-proton and neutron-neutron scattering lengths. Isospin symme- 
try refers to the equivalence of protons and neutrons and is an approximate sym- 
metry of the nuclear interactions. There are also two low-energy constants 
parameterizing the three-nucleon force that are determined from the triton bind- 
ing energy and the axial vector current contribution to triton decay”®. 

Data extrapolation and error analysis. We use the lattice Monte Carlo data to
determine the radial adiabatic Hamiltonian for the s-wave and d-wave channels for $L_t = 4$
to $L_t = 10$. We then compute the s-wave and d-wave phase shifts with errors
calculated using a jackknife analysis of the Monte Carlo data. In Extended Data Figs 2
and 3 we show NNLO results for the s-wave and d-wave phase shifts, respectively. 
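
For readers unfamiliar with the jackknife, the following minimal sketch shows the delete-one version for a generic estimator. The function name and the sample values in the usage note are illustrative assumptions; in the analysis described here the estimator would be the full chain from Monte Carlo data to phase shift, not a simple mean.

```python
import math


def jackknife(samples, estimator):
    """Delete-one jackknife: returns (estimate on all samples, 1 s.d. error)."""
    n = len(samples)
    # Re-evaluate the estimator with each sample left out in turn.
    leave_one_out = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return estimator(samples), math.sqrt(variance)


def sample_mean(values):
    """Arithmetic mean, used here as the simplest possible estimator."""
    return sum(values) / len(values)
```

For the simple mean, `jackknife([1.0, 2.0, 3.0, 4.0, 5.0], sample_mean)` reproduces the usual standard error of the mean. The advantage of the jackknife is that the same procedure propagates errors through nonlinear steps such as diagonalizing the adiabatic Hamiltonian.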

The dot-dashed lines in Extended Data Figs 2 and 3 indicate the fitted
exponential curves that are used to extrapolate to the limit $L_t \to \infty$. This is done by
including the residual dependence from an excited state at energy $\Delta E$ above the
ground-state energy. We use the ansatz:

$$\delta_0(L_t, E) = \delta_0(E) + c_0(E)\exp[-\Delta E_0 L_t a_t]$$

for the s-wave and:

$$\delta_2(L_t, E) = \delta_2(E) + c_2(E)\exp[-\Delta E_2 L_t a_t]$$

for the d-wave, where $c_0(E)$ and $c_2(E)$ are fit parameters; see ref. 17 for a
discussion of the asymptotic time dependence of the adiabatic projection method. The
dependence on $L_t$ is caused by a residual contamination due to excited states other
than alpha-alpha scattering states. We expect the convergence in $L_t$ to be quite
fast because there is a rather large energy gap between these excited states and the
alpha-alpha scattering threshold. This fast convergence is seen in Extended Data
Figs 2 and 3. The hatched bands show the one standard deviation errors of the
extrapolation fits, including the propagated Monte Carlo errors of the data points.
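
The exponential ansatz can be fitted with elementary tools. The sketch below is one possible approach under stated assumptions: for each trial gap $\Delta E$ on a grid, the ansatz is linear in the remaining parameters and is solved by ordinary least squares, and the gap with the smallest residual is kept. The function names, the unweighted fit and the grid scan are choices made here for illustration; the fits in the paper also propagate the Monte Carlo errors, which this sketch omits.

```python
import math

A_T = 1.0 / 150.0  # temporal lattice spacing in MeV^-1, from a_t = (150 MeV)^-1


def linear_fit(xs, ys):
    """Unweighted least squares for y = intercept + slope * x.

    Returns (intercept, slope, sum of squared residuals).
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    residual = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, residual


def extrapolate_phase_shift(l_ts, deltas, gap_grid):
    """Fit delta(L_t) = delta_inf + c * exp(-gap * L_t * a_t).

    gap_grid: trial values of the energy gap in MeV (all must be positive).
    Returns (delta_inf, c, gap) for the trial gap with the smallest residual.
    """
    best = None
    for gap in gap_grid:
        xs = [math.exp(-gap * l_t * A_T) for l_t in l_ts]
        delta_inf, c, residual = linear_fit(xs, deltas)
        if best is None or residual < best[0]:
            best = (residual, delta_inf, c, gap)
    return best[1], best[2], best[3]
```

With synthetic input generated from the ansatz itself for $L_t = 4$ to $10$, the function recovers the input asymptotic value when the true gap lies on the grid. With real data the residual should be weighted by the jackknife errors.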
Dependence on lattice spacing. The simulations reported here were performed on 
a coarse lattice with a lattice spacing a= 1.97 fm. This raises the question of whether 
there are sizeable lattice artefacts related to this spacing. We are currently working 
on simulations of lattice chiral EFT with smaller lattice spacings and improved lat- 
tice actions, which remove lattice artefacts to higher orders. Studies of lattice spac- 
ing dependence for the two-nucleon system*’ and for the alpha-alpha system with 
point-like alpha particles°**? found that observables that probe low-energy phe- 
nomena are largely independent of the lattice spacing in the interval a= 0.5-2 fm; 
strategies to further reduce lattice artefacts are discussed in refs 58 and 59. Hence, 
we expect little dependence on lattice spacing for low-energy scattering between 
two alpha particles, provided that the binding energy and structure of the alpha 
particles are well reproduced, as they are for the lattice action used here. 

Future prospects for lattice simulations in nuclear physics. Lattice simulations 
have been increasingly important for the development of ab initio nuclear theory. 
Starting from a theory for quarks and gluons, lattice QCD has enabled promising 
steps towards calculating the interactions of nucleons and very light nuclei**!, 
as well as of magnetic moments” and radiative capture“. Starting from a theory for 
protons and neutrons, lattice EFT has enabled promising steps towards calculating 
the structure of light and medium-mass nuclei!®!»**4 and neutron matter 


and is now allowing a move towards calculating nuclear scattering and reactions. 
The two methods complement one another, with lattice QCD providing input data 
for lattice EFT, and lattice EFT providing a computationally efficient method of 
simulating systems with many nucleons. The transfer of data can be realized by 
tuning operator coefficients in lattice EFT to match finite-volume energy levels of 
few-nucleon systems computed in lattice QCD simulations. Future calculations 
could start from quark masses and the electromagnetic fine-structure constant in 
lattice QCD, and end with reactions involving medium-mass nuclei in lattice EFT. 
Additional references. Other work on halo EFT includes studies on 7Li (ref. 67), 
19C (ref. 68), “He (ref. 69), and 7Be (ref. 70). 

Code availability. All codes used in this work are freely available from the authors 
on request. 
[Extended Data Figure 1 panels: eight-body system; two-cluster system; scattering waves.]

Extended Data Figure 1 | Schematic overview of our method. We start
with an eight-body system of protons and neutrons. Each alpha-particle
wave packet consists of four nucleons. The protons are red, the neutrons
are blue, and the spins are represented by arrows. Next, we perform ab initio
lattice Monte Carlo simulations to construct the adiabatic Hamiltonian for
two alpha clusters (grey spheres). Finally, we use the adiabatic Hamiltonian
to compute alpha-alpha scattering phase shifts.
Extended Data Figure 2 | s-wave extrapolations at NNLO. a-g, NNLO
results (circles) for the s-wave phase shift $\delta_0$ versus $L_t$ at laboratory energy
$E_{\mathrm{lab}}$ = 1.00 MeV, 2.00 MeV, 3.00 MeV, 6.96 MeV, 8.87 MeV, 10.88 MeV,
and 12.30 MeV, respectively, as labelled. The theoretical error bars indicate
1 s.d. uncertainty due to Monte Carlo errors. The dot-dashed lines are fits
to the data, used to extrapolate the $L_t \to \infty$ limits. The red hatched regions
indicate the 1 s.d. error estimate of the extrapolation.
Extended Data Figure 3 | d-wave extrapolations at NNLO. a-g, NNLO
results (circles) for the d-wave phase shift $\delta_2$ versus $L_t$ at laboratory energy
$E_{\mathrm{lab}}$ = 2.00 MeV, 3.00 MeV, 5.26 MeV, 6.96 MeV, 8.87 MeV, 9.88 MeV,
and 10.88 MeV, respectively, as labelled. The theoretical error bars indicate
1 s.d. uncertainty due to Monte Carlo errors. The dot-dashed lines are fits
to the data, used to extrapolate the $L_t \to \infty$ limits. The red hatched regions
indicate the 1 s.d. error estimate of the extrapolation.
Potential sea-level rise from Antarctic ice-sheet 
instability constrained by observations 


Catherine Ritz!**, Tamsin L. Edwards***, Gaél Durand!?, Antony J. Payne’, Vincent Peyaud!? & Richard C. A. Hindmarsh? 


Large parts of the Antarctic ice sheet lying on bedrock below sea 
level may be vulnerable to marine-ice-sheet instability (MISI)!, a 
self-sustaining retreat of the grounding line triggered by oceanic 
or atmospheric changes. There is growing evidence?‘ that MISI 
may be underway throughout the Amundsen Sea embayment 
(ASE), which contains ice equivalent to more than a metre of global 
sea-level rise. If triggered in other regions* °, the centennial to 
millennial contribution could be several metres. Physically plausible 
projections are challenging®: numerical models with sufficient 
spatial resolution to simulate grounding-line processes have been 
too computationally expensive”*'” to generate large ensembles for 
uncertainty assessment, and lower-resolution model projections" 
rely on parameterizations that are only loosely constrained by 
present day changes. Here we project that the Antarctic ice sheet 
will contribute up to 30 cm sea-level equivalent by 2100 and 72 cm 
by 2200 (95% quantiles) where the ASE dominates. Our process- 
based, statistical approach gives skewed and complex probability 
distributions (single mode, 10 cm, at 2100; two modes, 49 cm and 
6 cm, at 2200). The dependence of sliding on basal friction is a key 
unknown: nonlinear relationships favour higher contributions. 
Results are conditional on assessments of MISI risk on the basis of 
projected triggers under the climate scenario A1B (ref. 9), although 
sensitivity to these is limited by theoretical and topographical 
constraints on the rate and extent of ice loss. We find that 
contributions are restricted by a combination of these constraints, 
calibration with success in simulating observed ASE losses, and 
low assessed risk in some basins. Our assessment suggests that 
upper-bound estimates from low-resolution models and physical 
arguments (up to a metre by 2100 and around one and a half by
2200) are implausible under current understanding of physical 
mechanisms and potential triggers. 

It is not yet clear whether human-induced climate change has
influenced the circulation of warm Circumpolar Deep Water driving
grounding-line retreat of Pine Island Glacier, Thwaites Glacier
and other glaciers in the ASE, or how this circulation might change
in future. However, grounding-line retreat under MISI is proposed
to occur at a rate more or less independent of the original trigger and
may continue even if that trigger diminishes. MISI can be limited
by buttressing from ice shelves or specific configurations of bedrock
topography and possibly also higher friction at the bed. It has
been suggested that grounding-line retreat could continue in the ASE
for decades to centuries owing to weak topographical constraints,
possibly slowed in Pine Island Glacier by a region of higher friction
behind the grounding line. MISI could be triggered elsewhere by
ice-shelf collapse and/or exposure of further ice shelves to Circumpolar
Deep Water, both of which are projected in some regions under the
Special Report on Emissions Scenarios (SRES) A1B climate scenario.
Here we aim to quantify the dynamic contribution of the Antarctic ice 
sheet to sea level in the event of MISI under A1B. 


We take a statistical-physical approach, using a numerical ice-sheet 
model supplemented by statistical modelling of the probability of
MISI onset. The statistical modelling represents the ocean and atmos- 
pheric drivers of MISI and the response of ice shelves, which are poorly 
known owing to the modelling challenges described earlier. We assign 
probabilities of MISI onset as a function of time until 2200 in each of 
11 sectors (Extended Data Fig. 1a) using expert synthesis of observed 
grounding-line retreat and thinning and projected ice-shelf
basal and surface melting under A1B.

The response of the grounding-line position to MISI onset is rep- 
resented with a new parameterization: if a MISI trigger occurs in a 
sector, the potential rate of retreat is a function of the basal friction 
coefficient at each part of the current grounding line (Extended Data 
Fig. 2c-e), with the form of the dependence (Extended Data Fig. 1b) 
based on theoretical considerations. Grounding-line response is
modified by two ice dynamical conditions that allow retreat to occur
only if bedrock is downsloping from the margin (but allowing retreat
over small bumps) and only at a rate not exceeding the theoretical
limit. The response is also modified by the basal friction law (the
relationship between basal friction and sliding velocity), which has
three possible configurations in this study: linear-viscous, nonlinear 
Weertman, or plastic flow. 
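
As a rough illustration of the parameterization described above, the following Python sketch combines a piecewise-linear dependence of retreat rate on the logarithm of the basal friction coefficient with the two dynamical conditions. The exact functional form is not given in this text, so the shape assumed here (maximum rate below a low threshold, zero above a high threshold, linear in between), the function names and all numerical values are hypothetical; the allowance for retreat over small bumps is omitted.

```python
def potential_retreat_rate(alpha, alpha_low, alpha_high, v_max):
    """Assumed piecewise-linear retreat rate as a function of alpha,
    the log10 of the effective basal friction coefficient."""
    if alpha <= alpha_low:      # slippery bed: retreat at the maximum rate
        return v_max
    if alpha >= alpha_high:     # high-friction bed: no retreat
        return 0.0
    # linear decrease between the two thresholds
    return v_max * (alpha_high - alpha) / (alpha_high - alpha_low)


def retreat_rate(alpha, triggered, bed_downsloping, theoretical_limit,
                 alpha_low, alpha_high, v_max):
    """Apply the MISI trigger and the two ice dynamical conditions."""
    if not triggered or not bed_downsloping:
        return 0.0              # no trigger, or bed not sloping down inland
    rate = potential_retreat_rate(alpha, alpha_low, alpha_high, v_max)
    return min(rate, theoretical_limit)   # never exceed the theoretical limit


# Hypothetical values (rates in km per year), not taken from the paper.
print(retreat_rate(alpha=2.0, triggered=True, bed_downsloping=True,
                   theoretical_limit=0.8, alpha_low=1.0, alpha_high=3.0,
                   v_max=2.0))
```

Keeping the friction dependence separate from the two conditions mirrors the text: the first function gives the potential rate, and the second decides whether and how fast retreat is actually allowed.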

To assess modelling uncertainties, we generated a 3,000-member 
ensemble sampling MISI onset dates in the 11 sectors, 3 parameters 
governing retreat rate, bedrock topography, and the form of the basal 
friction law. We weighted the ensemble members in a Bayesian statis- 
tical framework with the difference between simulated and observed 
mass losses in the ASE (the only region where grounding-line retreat 
has been observed) to obtain calibrated projections. Details and pro- 
jections are in Supplementary Information. 

Observational calibration gives greatest weight to the ensemble 
members that most successfully simulate present-day ASE mass loss.
The expected mass trend from 1992 to 2011 is −59.0 ± 13.5 Gt yr⁻¹,
where the standard deviation is dominated by a conservative tolerance
for model error (Supplementary Information, section 1.7). The
range of simulated mass trends is −13.4 to −218.3 Gt yr⁻¹, with 39% of
the ensemble more than three standard deviations from the expected 
trend, of which nearly all simulate losses that are too large. Parameter 
values that generate the most rapid and widespread present-day retreat
in the ASE are thus effectively ruled out. These also tend to give the 
highest sea-level projections, so calibration decreases projected quan- 
tiles. Medians at 2100 and 2200 decrease by 33% and 20%, and 95% 
quantiles by 36% and 30%, respectively; the modes, however, increase, 
particularly at 2200 owing to a shift in density from one local mode 
to the other. 
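
The calibration step can be sketched in Python as follows. This is a minimal illustration, assuming a Gaussian likelihood centred on the expected trend quoted above (−59.0 ± 13.5 Gt yr⁻¹); the actual likelihood is specified in the Supplementary Information and may differ. The function names and the example ensemble values are hypothetical.

```python
import math

OBS_TREND = -59.0   # Gt/yr, expected ASE mass trend 1992-2011 (from the text)
OBS_SIGMA = 13.5    # Gt/yr, standard deviation including model-error tolerance


def calibration_weights(simulated_trends):
    """Normalized weights from an assumed Gaussian likelihood of the misfit."""
    raw = [math.exp(-0.5 * ((s - OBS_TREND) / OBS_SIGMA) ** 2)
           for s in simulated_trends]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches the quantile q."""
    pairs = sorted(zip(values, weights))
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q:
            return value
    return pairs[-1][0]   # guard against rounding in the cumulative sum


# Hypothetical ensemble: simulated present-day trends (Gt/yr) and
# the corresponding projected contributions (cm sea-level equivalent).
trends = [-13.4, -45.0, -60.0, -120.0, -218.3]
projections_cm = [2.0, 9.0, 12.0, 30.0, 55.0]

weights = calibration_weights(trends)
print(weighted_quantile(projections_cm, weights, 0.95))
```

Members far from the observed trend, such as the one simulating −218.3 Gt yr⁻¹ (more than eleven standard deviations away), receive negligible weight, which is how the largest projections are effectively ruled out.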

Figure 1 | Projected grounding-line retreat. a, b, Probability density
estimates of grounding-line retreat at 2100 (a) and 2200 (b), overlaid on
bedrock topography. Red lines show 0.05 contour: an estimated 95%
probability that retreat will be less extensive than this. c, d, ASE with Pine
Island (PIG) and Thwaites glaciers.

Spatial patterns of the probability of ungrounding (Fig. 1) show how
local bed elevation, slope and friction strongly modulate the response
to MISI onset. We find that the region with the highest probability
of ungrounding and sea-level contribution is the ASE, owing to the
combination of topography (downsloping bedrock below sea level) and
low friction (Extended Data Fig. 2c-e). Our 95% quantiles for the ASE 
are 25 cm at 2100 and 48 cm at 2200 (all values are sea-level equivalent
and, unless specified otherwise, 95% quantiles). The Thwaites region,
which includes the Smith and Kohler glaciers, contributes the greater
part of this: 58% at 2100 and 53% at 2200. This is partly due to the 
basin definition, but is also due to relatively rapid and substantial thin- 
ning of Thwaites upstream of the grounding line (see Supplementary 
Video 1). The Peninsula and Marie Byrd Land hardly respond,
despite being assigned the same probabilities of onset as the ASE
(owing to observed grounding-line retreat and thinning), because
their bedrock is largely above sea level.

Figure 2 | Projected sea-level rise. a, Quantiles of Antarctic dynamic
mass losses in cm sea-level equivalent as a function of time. b, Probability
densities at 2100 and 2200. c, Probabilities of exceeding particular

Although basin contributions depend partly on coastline length, sim- 
ilar topographical limits are seen elsewhere: on the basis of projected 
ice-shelf surface and basal melting, Princess Elizabeth Land and
MacRobertson Land are assigned substantial probabilities of MISI but
contribute only 1 cm by 2200, while Dronning Maud Land is assigned
lower probabilities but contributes up to 4 cm by 2100 and 8 cm by 2200.
Responses also vary across the three basins of the Ronne-Filchner sector,
which are assigned identical onset dates on the basis of projected
Circumpolar Deep Water intrusion. Ellsworth shows widespread
ungrounding, with the 95% quantile at 2200 approximately delineating
a previously deglaciated region (Fig. 1 and Extended Data
Fig. 3a), and contributes 9 cm by 2200; Shackleton Range and Pensacola
Mountains show much less retreat and contribute 6 cm and 4 cm,
respectively. 

For Totten Glacier in Wilkes Land, our results suggest that if current
dynamic thinning is MISI driven by Circumpolar Deep Water,
the region has some potential for ungrounding (up to 5 cm by 2200).
The Siple Coast is assigned a small probability from ice-shelf basal
melting but, when triggered, ungrounding is widespread owing to
low basal friction (Extended Data Fig. 2c); we estimate that the total
risk is small (up to 3 cm by 2200). These constraints are not absolute
bounds (greater deglaciation has occurred in the past over longer time
scales) but appear to limit the amount of ice that can be lost in two
centuries. Extended Data Figure 4 illustrates the effects of the two ice
dynamical conditions, for example in George V Land, which is thought
to be vulnerable in the long term (Supplementary Information,
section 2.2.1). 

The total continental contribution to sea level is relatively low in the 
first century and accelerates in the second (Fig. 2a), although a second 
mode emerges at 6 cm by 2200 (Fig. 2b). The probability of exceeding 
10 cm rises rapidly this century to 57% at 2100; for exceeding half a
metre, it reaches only 33% at 2200 (Fig. 2c, d). 
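
For readers who want to see how an exceedance probability of this kind is obtained from a weighted ensemble, here is a small Python sketch. The ensemble values and weights are invented for illustration and do not reproduce the paper's numbers; the function name is hypothetical.

```python
def exceedance_probability(values, weights, threshold):
    """Weighted fraction of ensemble members whose value exceeds threshold."""
    total = sum(weights)
    exceeding = sum(w for v, w in zip(values, weights) if v > threshold)
    return exceeding / total


# Hypothetical calibrated ensemble: contributions (cm SLE) and their weights.
contributions_cm = [4.0, 8.0, 12.0, 20.0, 35.0]
weights = [0.1, 0.3, 0.3, 0.2, 0.1]

for threshold in (10.0, 30.0, 50.0):
    p = exceedance_probability(contributions_cm, weights, threshold)
    print(f"P(> {threshold:.0f} cm) = {p:.2f}")
```

Evaluating the same function at each output year would give curves like those in Fig. 2c, and evaluating it over a range of thresholds at a fixed year would give curves like those in Fig. 2d.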

We find that the rate of sea-level rise from the ASE could be substantial
this century: up to 1.3 mm yr⁻¹ by 2050 and 2.1 mm yr⁻¹ by 2100
(Fig. 3). However, many simulations stop (near-zero mode at 2100 and
local mode at 2200; Fig. 3b) or slow their retreat, particularly those with
a linear-viscous friction law, so the 95% quantile at 2200 (1.1 mm yr⁻¹)
is half that at 2100. Narrow zones of higher friction (hard bedrock) 
situated a few tens of kilometres upstream impede further retreat 
(Extended Data Fig. 3b). Extended Data Figure 5 shows this and other 
threshold behaviour dependent on friction law. 
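
To relate the rates in mm yr⁻¹ quoted above to the mass trends in Gt yr⁻¹ used for calibration, a back-of-envelope conversion can be sketched in Python. It assumes a global ocean area of about 3.618 × 10⁸ km² and a water density of 1,000 kg m⁻³, which gives roughly 362 Gt per millimetre of global mean sea level; the paper's own sea-level-equivalent accounting may differ, so this is only an order-of-magnitude check. All names are hypothetical.

```python
OCEAN_AREA_KM2 = 3.618e8        # approximate global ocean area (assumption)
WATER_DENSITY_KG_M3 = 1000.0    # freshwater density (assumption)

# Mass of a 1 mm layer of water over the ocean, in gigatonnes (1 Gt = 1e12 kg).
GT_PER_MM = OCEAN_AREA_KM2 * 1e6 * 1e-3 * WATER_DENSITY_KG_M3 / 1e12


def gt_per_yr_to_mm_per_yr(mass_loss_gt_per_yr):
    """Convert an ice mass loss rate to a sea-level-equivalent rate."""
    return mass_loss_gt_per_yr / GT_PER_MM


def mm_per_yr_to_gt_per_yr(rate_mm_per_yr):
    """Convert a sea-level-equivalent rate back to a mass loss rate."""
    return rate_mm_per_yr * GT_PER_MM


print(round(gt_per_yr_to_mm_per_yr(59.0), 2))   # observed ASE loss, 1992-2011
print(round(mm_per_yr_to_gt_per_yr(2.1)))       # 95% quantile rate at 2100
```

Under these assumptions the observed loss of 59 Gt yr⁻¹ corresponds to roughly 0.16 mm yr⁻¹, and the 2.1 mm yr⁻¹ upper rate at 2100 to roughly 760 Gt yr⁻¹.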

Figure 3 | Projected rate of sea-level rise from the Amundsen Sea.
a, Quantiles of the rate of ASE dynamic mass losses in mm yr⁻¹ sea-level

The strong dependence of ASE response on basal friction law lies
behind the bimodal projections for Antarctica at 2200 (Extended
Data Fig. 6). Projections of MISI using one friction law may
systematically under- or overestimate sea-level rise and will almost
certainly underestimate its uncertainty. Although the sensitivity of
grounding-line migration to friction law has been explored
previously, a fully Bayesian approach allows us to quantify the
probabilistic contribution to uncertainty in sea-level rise. Extensive 
observations of basal type and hydrology, and better theoretical under- 
standing of basal hydrology and sliding, would be needed to reduce 
this uncertainty. 

Sensitivity to onset probabilities is limited for most basins by 
glaciological constraints that slow or stop retreat (Supplementary 
Information, section 2.2.2). Altering retreat onset probabilities by 
±20% changes basin 95% quantiles at 2200 by up to about 1 cm,
and using early or late ASE onset dates (2000-2010 or 2020-2030)
changes the 95% quantile at 2200 by less than 2 cm (Extended Data
Fig. 9a). Only Shackleton, Siple Coast and Transantarctic Mountains
(Extended Data Fig. 9b-d) approach a linear response; increasing
Siple Coast onset probabilities tenfold increases the 95% quantile at
2200 by 8 cm.

Observational calibration reduces projected quantiles by constrain- 
ing the maximum rate of retreat and the regions over which this can 
occur (Extended Data Figs 7 and 8), mainly in the ASE. It presupposes 
that the best parameter values in one region are the best everywhere 
(although not the sliding law, which is not calibrated because it var- 
ies spatially; Supplementary Information, section 1.7). To assess the 
effect of this, we estimate that calibrating only the ASE contribution 
would increase 95% quantiles by approximately 6 cm (22%) at 2100 
and 21 cm (29%) at 2200. Results are robust to other calibration choices 
(95% quantiles at 2200 vary by a few centimetres; Supplementary 
Information, section 2.2.4). 

Our results are consistent with regional high-resolution model pro- 
jections. In particular, projected ice losses by 2200 under A1B driven by 
one of the ocean simulations on which we base our onset probabilities
lie within our uncertainty estimates for the ASE (19-30% quantiles), 
Ronne-Filchner (Ellsworth, Pensacola Mountains, Shackleton: 56-65% 
quantiles) and Ross basins (Siple Coast, Transantarctic Mountains: 
90%; tenfold Siple Coast probabilities 80%). For Marie Byrd Land, the 
high-resolution projections are lower than our ensemble, but the con- 
tribution to our result is less than a centimetre. Projected rates for Pine 
Island and Thwaites glaciers are also consistent with high-resolution 
modelling under idealized basal melting scenarios, and continental 
totals with a statistically based projection assuming ASE collapse in 
2012 and linear growth of ice discharge elsewhere (Supplementary
Information, section 2.1). 

Our projections are essentially incompatible with upper-bound estimates
for MISI of around 50-80 cm by 2100 and 140 cm by 2200
derived from physical arguments, extrapolation or low-resolution
numerical models, and around 1 m by 2100 (95% quantile) from expert
elicitation. Half a metre of sea-level rise by 2100 is not exceeded at
the 99.9% quantile (uncalibrated: 98th percentile). Contributions of
around 1 metre by 2100 were obtained (Extended Data Fig. 10 and
Supplementary Information, section 2.2.3) by setting the parameter
values to maximize ice loss and additionally either violating the theoretical
limit or triggering immediate MISI everywhere (in 2000 for the
Peninsula, ASE and Marie Byrd Land; 2020 elsewhere), but we do not
consider these realistic. One metre by 2200 is exceeded at the 99.9%
quantile (uncalibrated: 95th percentile).

[Figure 3 plot residue: density of the rate of SLE change (mm yr⁻¹) at 2100 and 2200, and probability of exceedance (%).]
and 2200. c, Probabilities of exceeding particular thresholds as a function
of time. d, Probability of exceeding any threshold at 2100 and 2200.

We therefore find that MISI in the ASE could drive large and rapid 
sea-level rise but that the total Antarctic contribution is moderated 
by important physical constraints. Large uncertainties remain, in 
particular basal friction and its evolution, and further observations 
of surface and grounding-line changes would improve initialization 
and calibration. Future advances (high-resolution simulation of the 
ice-sheet–ice-shelf–ocean system; increased computational resources)
will improve representation of the processes we parameterize and 
allow ensemble methods, while comparing multiple models would 
explore other representations of ice dynamics. But, given current 
understanding, our results indicate that plausible predictions of 
Antarctic ice-sheet instability leading to greater than around half a 
metre of sea-level rise by 2100 or twice that by 2200 would require 
new physical mechanisms, new projections of MISI triggers,
or both. 

Extended Data Figure 1 | Grounding-line retreat parameterization.
a, Cumulative probability distributions of MISI onset for 14 basins
(Fig. 1) aggregated into 11 independent sectors. b, Piecewise linear
parameterization prescribing the dependence of grounding-line retreat
rate on the logarithm of the effective basal friction coefficient (Extended
Data Fig. 2). Each of the 1,000 functional forms is a variant used in the
ensemble; a subset are shown in bold as examples. See also Supplementary
Information, sections 1.6.1, 1.6.2.

Extended Data Figure 2 | Initialization and basal friction evolution.
a-c, Initial values of the difference between simulated and observed
surface elevation (a); velocities averaged over ice thickness (b); the
logarithm of the initial effective basal friction coefficient, α = log10(

Extended Data Figure 3 | Projected grounding-line retreat and initial basal friction. a, b, Initial grounding line and map of α values (Extended Data

maximum sea-level contribution at 2200 (plastic sliding law): standard faster than the theoretical limit (c). See also Supplementary Information,

© 2015 Macmillan Publishers Limited. All rights reserved


[Extended Data Figure 5 plot residue: panels a (ASE 2100) and b (ASE 2200); x axis, present-day rate of mass loss (Gt yr⁻¹), 0-250; legend: viscous, Weertman, plastic; observations, observational uncertainty, total uncertainty.]

Extended Data Figure 5 | Relationship between present and future
sea-level contributions from the Amundsen Sea. a, b, Dynamic mass
losses in cm sea-level equivalent from the ASE at 2100 (a) and 2200 (b),
as a function of present-day mass loss in the same region. The branches
arise from interactions between basal drag coefficient and friction law that
produce different rates of, and impediments to, grounding-line retreat.
The observed mass loss is shown, along with observational (±3σ) and
total (±3σ) uncertainties (Supplementary Information, section 1.7).

Extended Data Figure 6 | Contributions of each basal friction law. a, b, Probability distributions of Antarctic dynamic mass losses in cm sea-level
equivalent at 2100 (a) and 2200 (b) (as in Fig. 2b), showing the cumulative contributions of the basal friction laws.

Extended Data Figure 7 | Uncalibrated projections. a-h, Prior (uncalibrated) projections of Antarctic dynamic mass losses in cm sea-level equivalent
(a-d); rate of ASE dynamic mass losses in mm yr⁻¹ sea-level equivalent (SLE) (e-h). Posterior (calibrated) projections are in Figs 2 and 3. See also
Supplementary Information, section 1.7.

Extended Data Figure 8 | Parameter calibration and influence.
a, b, Weights for each of the 1,000 sub-ensemble parameter sets (averaged
over basal friction laws) as a function of low threshold of effective basal
drag coefficient (α_low) and maximum retreat rate (V_max) (a); bedrock map
index and high threshold of effective basal drag coefficient (α_high) (b).
Darker colours indicate values favoured by observational calibration.
c, d, Uncalibrated dynamic mass losses at 2200 in cm sea-level equivalent
(SLE) as functions of the same.

Death from drought in tropical forests is triggered
by hydraulics not carbon starvation


L. Rowland, A. C. L. da Costa, D. R. Galbraith, R. S. Oliveira, O. J. Binks, A. A. R. Oliveira, A. M. Pullen, C. E. Doughty,
D. B. Metcalfe, S. S. Vasconcelos, L. V. Ferreira, Y. Malhi, J. Grace, M. Mencuccini & P. Meir


Drought threatens tropical rainforests over seasonal to decadal 
timescales, but the drivers of tree mortality following drought
remain poorly understood. It has been suggested that reduced
availability of non-structural carbohydrates (NSC) critically
increases mortality risk through insufficient carbon supply to
metabolism (‘carbon starvation’). However, little is known
about how NSC stores are affected by drought, especially over the 
long term, and whether they are more important than hydraulic 
processes in determining drought-induced mortality. Using data 
from the world’s longest-running experimental drought study in 
tropical rainforest (in the Brazilian Amazon), we test whether 
carbon starvation or deterioration of the water-conducting 
pathways from soil to leaf triggers tree mortality. Biomass loss
from mortality in the experimentally droughted forest increased 
substantially after >10 years of reduced soil moisture availability. 
The mortality signal was dominated by the death of large trees, 
which were at a much greater risk of hydraulic deterioration than 
smaller trees. However, we find no evidence that the droughted 
trees suffered carbon starvation, as their NSC concentrations were 
similar to those of non-droughted trees, and growth rates did not 
decline in either living or dying trees. Our results indicate that 
hydraulics, rather than carbon starvation, triggers tree death from 
drought in tropical rainforest. 

Drought-response observations from both field-scale experiments 
and natural droughts have demonstrated increased mortality over the 
short-term (1-3 years), with notably higher vulnerability for some 
taxa, and for larger trees. After several years of drought, recovering
growth rates in smaller trees (diameter at breast height, dbh, <40 cm)
and reduced mortality have been recorded at different locations.
However, the long-term (>10 year) sensitivity of tropical forests to
predicted prolonged and repeated water deficit and the physiological
mechanisms influencing this are poorly understood. Through-fall
exclusion (TFE) studies, which create soil moisture deficit by the exclusion
of a fraction of incoming rainfall, provide the only current means
to assess the long-term response in mechanistic detail.

Trees experiencing drought stress are thought to die from direct 
physiological failure and/or from injury and biotic attack associated 
with a decline in physiological vigour. A global effort to identify
the relevant physiological mechanisms triggering death and thus to 
improve predictions of forest tree mortality has focused on the twin 
possibilities of: (1) failure to supply sufficient carbon substrate to 
metabolism following drought-related reductions in photosynthesis 
and increased use of NSC, theoretically leading to carbon starvation; 
and (2) deterioration of the water-conducting xylem tissue, causing a 
rapid or gradual failure of key dependent processes (for example, gas 
exchange, photosynthesis, phloem transport), and potentially leading 
to tissue desiccation, ultimately leading to mortality. Despite recent
intensive research, it is unclear how important these two mechanisms
are in different biomes and how, or whether, to model them.

Since 2002 a 50% TFE treatment has been implemented at a 
1 ha-scale drought experiment in old-growth forest at Caxiuanã
National Forest Reserve, Pará State, Brazil, to simulate maximum
possible rainfall reductions predicted to occur in parts of Amazonia
by 2100 (ref. 1). Mortality, recruitment and growth rates of
all trees >10 cm dbh have been monitored throughout the experimental
period (see Methods). Recently, seasonal data on NSC concentrations 
were measured on leaves, branches and stems of 41 trees (20 trees on the 
control, 21 trees on the TFE) of the most common genera in the exper- 
iment (Extended Data Table 1). Xylem vulnerability curves were also 
performed on the branches of these trees (see Methods). Here, we syn- 
thesize these data to test whether long-term soil moisture deficit alters 
NSC storage and use in tropical rainforest trees, and whether this or hydraulic
processes are most strongly associated with increased mortality rates.

By 2014, following 13 years of the TFE treatment, cumulative biomass
loss through mortality was 41.0 ± 2.7% relative to pre-treatment
values (Fig. 1a), and the rate of loss had increased substantially since
the previously reported value of 17.2 ± 0.8%, after 7 years of TFE.
Accelerating biomass loss and failure to recover substantially, or to reach
a new equilibrium, have led to a committed flux to the atmosphere
from decomposing necromass of 101.9 ± 19.1 Mg C ha⁻¹ (Fig. 1a).
This biomass loss has been driven by elevated mortality in the largest
trees (Fig. 1b), as previously observed over shorter timescales, and
has created a canopy that has had a persistently lower average leaf area
index during 2010-2014 (12.0 ± 1.2% lower; Extended Data Fig. 1).
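
The acceleration can be made concrete with a short piece of arithmetic, sketched here in Python. It uses only the two cumulative losses quoted above (17.2% after 7 years and 41.0% after 13 years) and ignores their uncertainties; treating the loss as linear within each interval is a simplifying assumption, and the function name is hypothetical.

```python
def mean_annual_loss(loss_start_pct, loss_end_pct, years):
    """Average biomass loss per year (percentage points of pre-treatment
    biomass) over an interval, assuming a linear change within it."""
    return (loss_end_pct - loss_start_pct) / years


early = mean_annual_loss(0.0, 17.2, 7)        # first 7 years of TFE
late = mean_annual_loss(17.2, 41.0, 13 - 7)   # the following 6 years

print(f"years 1-7:  {early:.2f} percentage points per year")
print(f"years 8-13: {late:.2f} percentage points per year")
```

On this crude measure the mean loss rate rises from about 2.5 to about 4.0 percentage points per year, consistent with the statement that the rate of loss increased substantially.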

Remarkably, individual tree growth rates for the four years before 
death showed no significant reduction in either the TFE or control 
plots (Fig. 2a), indicating that growth is prioritised to the point of 
death irrespective of the soil moisture deficit treatment. From 2008, 
tree growth in every wet season (January-June) on the TFE treatment 
relative to the control was significantly elevated (P < 0.05) in the small 
and medium trees (up to 4.6 ± 0.2 times higher in small trees, and
2.9 ± 0.2 times higher in medium trees), and maintained in the largest
trees (10-20 cm, 20-40 cm and >40 cm dbh, respectively; Fig. 2b-d).
Elevated wet season growth occurred despite a 0.1-0.9 MPa reduction
in average soil water potential (Ψ_s) at depths of 0-4 m on the TFE and
a loss of seasonality in Ψ_s (Extended Data Fig. 2). Increased growth
in the small trees occurred from 2008 onwards, following earlier 
substantial mortality of large trees (Fig. 1), which generated canopy 
gaps. Increased light availability to smaller trees and, presumably, 
reduced below-ground competition for water and nutrients, allowed 
competitive release of trees on the TFE, and elevated growth rates.
Competitive release on the TFE implies that, following 13 years of 
drought-stress, photosynthetic production is sufficient not only to 
maintain growth in the largest trees (Fig. 2d), but to increase growth
in trees <40 cm dbh (Fig. 2b, c). This response would not be possible
if the majority of trees were severely carbon limited, unless very
considerable long-term (or renewed) carbon resources were being
drawn upon.

Figure 1 | Changes in biomass and mortality rates. a, Biomass on the
control and TFE plot from 2001-2014 (Mg C ha⁻¹ yr⁻¹). Error bars show
the s.e.m. calculated from 12 estimates of biomass for trees on the control
plot (n = 369) and TFE (n = 358), accounting for uncertainty in wood
density and allometric equations (see Methods). b, Mortality rate (% stems
per year) for trees on the control plot (black) and TFE (grey) separated
for trees of 10-20 cm dbh (control n = 164-193, TFE n = 132-174, with
range showing 2001-2014 maximum and minimum n), 20-40 cm dbh (control
n = 97-105, TFE n = 81-104) and >40 cm dbh (control n = 41-45, TFE
n = 17-37). The genus and date of death for each tree used in the mortality
rate calculations are shown in Extended Data Table 3.


Prioritization of growth under drought in the TFE is consistent 
with recent observations following short-term drought in Amazonia.
However, the maintenance of NSC concentrations in the TFE treatment
suggests that the prioritisation of growth during drought does not occur
at the expense of depleted carbon stores, as previously hypothesized.
Neither the concentrations of soluble sugar (carbon immediately availa- 
ble to metabolism) nor starch (stored carbon which can be converted to 
sugars) were significantly depleted in stem, leaf and branch tissue from 
the TFE, relative to control (Fig. 3). The seasonal changes in both sugar 
and starch concentrations, which varied by 50-90%, were much larger 
than any differences associated with the TFE treatment (Fig. 3). Despite 
13 years of severely reduced soil moisture availability, the seasonal cycle 
and use of NSCs was unaltered, implying that the sampled trees did not 
draw significantly upon their NSC reserves to buffer against the long- 
term effects of soil moisture deficit. Large changes in carbon allocation 
from roots and leaves to maintain stem growth during drought have
not been reported on the TFE. Similarly, no drought-induced reductions
in photosynthetic capacity occurred on the TFE, although how
total canopy productivity is affected remains uncertain. Considering 
this and additional evidence of no increase in herbivore attack on the 
TFE (Extended Data Fig. 3), our results suggest progressive carbon 
starvation and biotic foliar consumption are not important drivers of 
the mortality patterns observed in the TFE forest following extended 
severe soil moisture deficit (>10 years). 

Figure 3 | Leaf, branch and stem NSC concentrations. a, b, Percentage of
soluble sugars (a) and starch (b) in biomass of leaves, branches and stems,
in the late dry season (November 2013), mid wet season (March 2014) and
in the wet-to-dry transition (June 2014). Each value for leaves, branches
or stems represents an average of samples taken from n = 20 trees on the
control, and n = 21 trees on the TFE. Error bars show s.e.m. and *
indicates a significant difference at P < 0.05 using the Wilcoxon test. June
2014 has significantly elevated sugar values on the TFE plot; however,
the absolute values for sugar concentration are very low, and the absolute
differences are very small.

Deterioration of the water transport system in the xylem tissues following
drought can also lead to death. The vulnerability of the xylem
to drought is described by a vulnerability curve, which relates water
potential in xylem conduits to loss of hydraulic conductivity because of
occlusions by gas emboli. The water potential at which 50% loss of xylem
conductivity occurs (P50, MPa) is a commonly used index of embolism
resistance. We determined xylem P50 for the trees on the control and
TFE plots, with tree dbh ranging from 15 to 48 cm. A highly significant
decrease in P50 with dbh was found across TFE and control (Extended
Data Table 2, P < 0.01). As dbh increased from 15 to 48 cm there was a
1.3 ± 0.2 MPa reduction in the P50 value, with significant genus-to-genus
differences (Fig. 4). Leaf water potential (Ψ_l) could only be measured
during limited sampling campaigns (2-3 days) that were characterized
by low vapour pressure deficit (VPD, 54-59% of peak dry season values)
and unseasonal rainfall in the preceding days. Differences between treatment
and control Ψ_l were not detected. Mean midday Ψ_l recorded across
all the trees together with the vulnerability curves determined for each
genus were used to predict the percentage loss of xylem conductivity
(PLC) with dbh. Values of PLC at mean Ψ_l increased with dbh, with the
largest diameter trees predicted to have reductions in conductive capacity
of about 80% in some genera, indicating significant vulnerability to
hydraulic deterioration (inset of Fig. 4).
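
To make the link between P50, water potential and PLC concrete, the following Python sketch evaluates a sigmoidal vulnerability curve. The logistic form used here is a commonly used description of such curves and is an assumption; this text does not state which functional form was fitted. The slope parameter and all numerical values are hypothetical.

```python
import math


def percent_loss_conductivity(psi_mpa, p50_mpa, slope):
    """Assumed logistic vulnerability curve: PLC (%) at xylem water
    potential psi_mpa, for a given P50 (MPa) and slope (per MPa).
    Water potentials are negative; PLC rises as psi becomes more negative."""
    return 100.0 / (1.0 + math.exp(slope * (psi_mpa - p50_mpa)))


# By construction, PLC is 50% when the water potential equals P50.
print(percent_loss_conductivity(-2.0, p50_mpa=-2.0, slope=2.0))

# Hypothetical comparison at the same midday water potential (-1.5 MPa):
# a less negative P50 gives a larger predicted loss of conductivity.
for p50 in (-2.3, -1.0):
    plc = percent_loss_conductivity(-1.5, p50_mpa=p50, slope=2.0)
    print(f"P50 = {p50} MPa -> PLC = {plc:.0f}%")
```

Because PLC depends on the gap between the operating water potential and P50, a shift of about 1 MPa in P50 can move a tree from modest to severe predicted conductivity loss at the same water potential.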

Given no evidence of carbon starvation and similar Ψ_l across plots
in the sample dates, why did many more trees die in TFE than control?
The lack of treatment differences in Ψ_l contrasts starkly with the long-term
records of lower Ψ_s (Extended Data Fig. 2). The lack of difference
in midday Ψ_l could have been caused by sampling constraints or by
isohydric behaviour. We found evidence of non-isohydric behaviour
in our diurnal Ψ_l measurements (Extended Data Fig. 4), with overall
strong linear declines in Ψ_l observed with increasing VPD on the
control (R² = 0.18, P < 0.01) and in particular on the TFE (R² = 0.33,
P < 0.01). Consequently, limited sampling is the most likely cause of
equal Ψ_l between the two plots, with TFE trees likely to have more
negative Ψ_l and lower hydraulic conductance during VPD maxima in
the dry season. Reduced carbon uptake because of stomatal closure
in some TFE trees is possible, but is unlikely to have caused carbon
starvation considering that growth rates were maintained or elevated on
the TFE (Fig. 2) and that radial growth should decline before photosynthesis
in drought conditions. Even with an isohydric response, trees
on the TFE would still be likely to suffer greater hydraulic deterioration
caused by greater PLC in the roots and main stem. Strongly reduced Ψ_s
on the TFE (Extended Data Fig. 2) and significant hydraulic vulnerabil- 
ity of the tall trees are consistent with the hypothesis of hydraulic deteri- 
oration as the most likely trigger of greater mortality, particularly in the 
largest trees, as observed. Why the xylem tissue of larger trees is more 
vulnerable to embolism deserves further study. Taller trees are predisposed 
to greater hydraulic stress, from elevated atmospheric demand 
and longer hydraulic path lengths. As the canopies are exposed to 
rainfall in the TFE, smaller trees could avoid hydraulic deterioration 
through leaf water uptake, but this may not be sufficient to save the 
largest trees, which we hypothesize are forced to maintain their high 
growth rates until death to continually replace dysfunctional xylem. 
Following decadal-scale soil moisture depletion, our results suggest 
that tropical rainforests will experience accelerating biomass loss and 
a likely transition to a lower statured, lower biomass forest state, due 
to substantially elevated mortality of the largest trees. This mortality is 
most likely triggered by hydraulic processes, which lead to hydraulic 
deterioration and subsequent, potentially rapid, limitations in carbon 
uptake, instead of being caused directly by gradual carbon starvation. 
Under natural drought these forests may be at greater risk 
than from experimental drought, as severe soil moisture deficit is com- 
bined with low humidity and high air temperature, increasing hydraulic 
demand. Improved prediction of the sensitivity of tropical tree mortality 
to drought should therefore focus on improved model simulation of 
plant hydraulics and modelling environmental controls on growth. 
Decadal-scale ecological data such as these are rare, but they are invaluable 
for testing and improving predictions from vegetation models over 
timescales that are relevant to climate change. They also underpin the 
long-term environmental policy needed to manage the natural capital 
that is embedded in tropical rainforests. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 

Site. The through-fall exclusion (TFE) experiment is located in the Caxiuanã 
National Forest Reserve in the eastern Amazon (1° 43' S, 51° 27' W), ~400 km 
west of the nearest city, Belém, State of Pará, Brazil. The experiment is located in 
terra firme forest, on yellow oxisol soils which are 75-83% sand, 12-19% clay and 
6-10% silt. The site is 15 m above sea level, has a mean annual rainfall of 
2,000-2,500 mm and a pronounced dry season between June and November. 

The experimental site has two 1-ha plots: the TFE, over which plastic panels 
and gutters have been placed at a height of 1-2 m, and which exclude 50% of the 
incident rainfall; and a corresponding control plot, <50 m from the TFE, on which 
there has been no manipulation of incident rainfall. The TFE was trenched to 
a depth of 1-2 m to remove the effect of through-flow of soil water; to control 
for any temporary damage to roots from the trenching, the control plot was also 
trenched to the same depth. The TFE treatment has been running 
continuously from January 2002 to the present, except for a 1-week period in 
November 2002 (full removal), a 1-month period during the dry season in November 
2014 (sequential removal of all panels) and during 2004 (30% removal). 

No statistical methods were used to predetermine sample size, the experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Soil moisture data. In both the control and TFE plots there are soil access pits in 
which volumetric soil water content sensors (CS616, Campbell Scientific, Logan, 
USA), located at depths of 0, 0.5, 1, 2.5 and 4 m, monitor soil moisture every hour 
(see Fisher et al. for full methodology). New data presented here are from March 
2008 to December 2014, averaged into monthly values. Due to equipment failure, 
some soil moisture data are missing for August and December 2013, for 2008 and 
2010 on the control plot, and for November and December 2013 on the TFE plot. 
Volumetric soil water content was converted into soil water potential (Ψ_s) using 
the necessary van Genuchten parameters previously calculated by Fisher et al. 
based on soil hydraulics measurements at this site. 
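
The conversion from volumetric water content to water potential can be illustrated with the standard van Genuchten retention curve, inverted for suction. The sketch below is only an illustration of that general relation: the parameter values are hypothetical placeholders, not the site parameters of Fisher et al., and the function names are invented for this example.

```python
def potential_to_water_content(psi_mpa, theta_r, theta_s, alpha, n):
    """Van Genuchten retention curve: water content from water potential.

    psi_mpa : soil water potential in MPa (zero or negative)
    theta_r : residual volumetric water content (m3 m-3)
    theta_s : saturated volumetric water content (m3 m-3)
    alpha   : scale parameter (1/MPa in this sketch)
    n       : shape parameter (> 1); m is taken as 1 - 1/n
    """
    m = 1.0 - 1.0 / n
    effective_saturation = (1.0 + (alpha * abs(psi_mpa)) ** n) ** (-m)
    return theta_r + effective_saturation * (theta_s - theta_r)


def water_content_to_potential(theta, theta_r, theta_s, alpha, n):
    """Inverse of the curve above: water potential (MPa) from water content."""
    m = 1.0 - 1.0 / n
    effective_saturation = (theta - theta_r) / (theta_s - theta_r)
    # Clip to a valid range so sensor noise cannot produce invalid powers.
    effective_saturation = min(max(effective_saturation, 1e-9), 1.0)
    suction = ((effective_saturation ** (-1.0 / m) - 1.0) ** (1.0 / n)) / alpha
    return -suction


# Hypothetical parameters for illustration only (not the published values).
PARAMS = dict(theta_r=0.05, theta_s=0.45, alpha=100.0, n=1.5)

if __name__ == "__main__":
    for theta in (0.40, 0.30, 0.20, 0.10):
        psi = water_content_to_potential(theta, **PARAMS)
        print(f"theta = {theta:.2f} m3 m-3 -> psi = {psi:.3f} MPa")
```

Applied to monthly mean sensor readings, a function of this kind yields one Ψ_s value per depth and month; drier soil gives more negative values.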

Biomass data. Trees >10 cm diameter at breast height (dbh) on both the control 
and the TFE plots were tagged and identified to species level in September 2000. 
Diameters were measured on these trees at 1.3 m, unless buttress roots were pres- 
ent, in which case the measurement was made above the buttressing. Individuals on 
the plots were re-censused at varying intervals from January 2001, until November 
2014. During each census the trees were also assessed to be either dead or alive. 
A tree was considered dead if leaflessness was accompanied by a persistent zero 
or negative stem increment, and/or the tree had snapped or fallen to the ground. 
Recruitment of new trees into the >10 cm dbh size class was enumerated in 2005, 
2009 and 2014. 

Trees in the 10 × 10 m subplots adjacent to the trenches were excluded from 
our biomass and growth rate analysis to eliminate possible effects of changes in 
mortality resulting from root damage. Consequently, 369 trees on the control and 
358 trees on the TFE, each in 0.64 ha, were analysed and the biomass scaled to 1 ha. 
Following da Costa et al., trees were grouped into small (10-20 cm dbh), medium 
(20-40 cm dbh) and large (>40 cm dbh) size classes. 

Biomass was calculated using the Chave et al. equation which uses diameter, 
wood density and environmental predictors. A mean and standard deviation of 
wood density for each tree was calculated from data in the global wood density 
database and from Patiño et al. Multiple estimates of wood density at the 
species, genus and family level were used to calculate standard deviations on our 
wood density estimates. Of all the trees on both plots, 68% had values for wood 
density at species level, 18% at genus level and 3% at family level; 11% of trees were 
not identified and were given a plot level average wood density with an associated 
standard deviation. A standard error on our biomass estimations was calculated, 
which accounted for the error associated with wood density estimation and variations 
in commonly used allometric equations. Twelve calculations were used to 
calculate the error on our biomass values; these 12 biomass estimates were from 
combinations of four biomass equations (two Chave et al. equations, each with 
and without height) and three wood density estimates (mean wood density and 
mean wood density ± one standard deviation). The allometric equations selected 
represent one of the most commonly used biomass equations for Amazonia (with 
and without height as a predictor variable) and the most recent and most comprehensive 
biomass equations for our study area (with and without height as a 
predictor variable). Measurements of height were not available and were calculated 
from an equation developed specifically for the region of the Amazon in which 
our plots are located. Mortality rate was calculated according to da Costa et al., 
separately for trees of 10-20 cm dbh, 20-40 cm dbh and >40 cm dbh on the 
TFE and control plot. 
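
The way four equations and three wood density values combine into 12 plot-level estimates can be sketched as follows. This is a minimal illustration under stated assumptions: the two allometric functions are hypothetical placeholders and do not use the published Chave et al. coefficients, the tree records are invented, and summarising the spread as the standard deviation of the 12 estimates divided by √12 is one possible choice rather than the procedure confirmed by the text.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def placeholder_equation_a(dbh_cm, wood_density, height_m=None):
    """Hypothetical allometry A (kg per tree); NOT a published equation."""
    if height_m is None:
        return 0.10 * wood_density * dbh_cm ** 2.5
    return 0.05 * wood_density * dbh_cm ** 2 * height_m


def placeholder_equation_b(dbh_cm, wood_density, height_m=None):
    """Hypothetical allometry B (kg per tree); NOT a published equation."""
    if height_m is None:
        return 0.12 * wood_density * dbh_cm ** 2.4
    return 0.06 * (wood_density * dbh_cm ** 2 * height_m) ** 0.98


def biomass_estimates(trees, sampled_area_ha=0.64):
    """Return 12 plot totals (kg per ha): 4 equations x 3 wood densities."""
    equations = [
        (equation, use_height)
        for equation in (placeholder_equation_a, placeholder_equation_b)
        for use_height in (False, True)
    ]
    density_offsets = (-1.0, 0.0, 1.0)  # mean - sd, mean, mean + sd
    totals = []
    for (equation, use_height), offset in product(equations, density_offsets):
        total = 0.0
        for tree in trees:
            density = tree["rho_mean"] + offset * tree["rho_sd"]
            height = tree["height_m"] if use_height else None
            total += equation(tree["dbh_cm"], density, height)
        totals.append(total / sampled_area_ha)  # scale sampled area to 1 ha
    return totals


def summarise(totals):
    """Mean of the estimates and an (assumed) standard error across them."""
    return mean(totals), stdev(totals) / sqrt(len(totals))


if __name__ == "__main__":
    example_trees = [  # invented records for illustration
        {"dbh_cm": 15.0, "rho_mean": 0.60, "rho_sd": 0.05, "height_m": 14.0},
        {"dbh_cm": 45.0, "rho_mean": 0.70, "rho_sd": 0.08, "height_m": 30.0},
    ]
    estimate, error = summarise(biomass_estimates(example_trees))
    print(f"biomass = {estimate:.0f} +/- {error:.0f} kg per ha")
```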

Seasonal growth data. Dendrometers were installed just above or below the 
point of dbh measurement on all trees >10 cm dbh. Circumference measurements 
from the dendrometer bands were made monthly to tri-monthly from 
January 2005 to November 2014, with the exception of a six month gap from July 
2007 and January 2013. Seasonal shrinkage was calculated by taking the average 
growth rate of 19 trees on the control and TFE plots which experienced no overall 
growth, but demonstrated a seasonal pattern of shrinkage and expansion. Tropical 
trees can experience net diameter shrinkage in the dry season, due to a lack of 
growth, accompanied by reduced water content of stem tissues, with subsequent 
swelling when tissues are rehydrated in the wet season. An average pattern of 
shrinkage and expansion was therefore subtracted from all trees to ensure that dry 
season growth was not under-estimated and wet season growth over-estimated. 
Dendrometer increments were filtered to remove the growth spikes from measurement 
errors following Rowland et al. Subsequent to this procedure, gap-filling 
was done using linear interpolation; on the control and TFE plot 16% and 17% 
of the data were gap-filled, respectively. Growth rates per day calculated from the 
dendrometer measurements were averaged into three-monthly periods to give 
continuous tri-monthly growth rates from 2005 to 2014. When trees in the 10 × 10 m 
subplots adjacent to the trenches, and trees with poor quality dendrometer measurements 
throughout the 10 year study period for growth were excluded, 316 and 
310 trees on the control and TFE plot remained, respectively. All growth and mortality 
analyses were done using the R statistical package (version 3.1.2). Tests for 
significance were performed using the Wilcoxon signed-rank test. 
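
The three processing steps described above (removing the average seasonal shrinkage pattern, gap-filling by linear interpolation, and averaging into three-month periods) can be sketched as below. This is an illustrative outline only: it assumes equally spaced monthly values with `None` marking missing measurements, the function names are invented, and the analysis in the paper was run in R rather than Python.

```python
def remove_seasonal_pattern(increments, seasonal_pattern):
    """Subtract the average shrink/swell signal from one tree's increments."""
    return [
        None if value is None else value - seasonal
        for value, seasonal in zip(increments, seasonal_pattern)
    ]


def fill_gaps(values):
    """Linearly interpolate runs of None between two known neighbours.

    Leading or trailing gaps are left as None because there is nothing
    to interpolate between.
    """
    filled = list(values)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def three_month_means(monthly_rates):
    """Average complete, gap-free blocks of three consecutive months."""
    blocks = [monthly_rates[i:i + 3] for i in range(0, len(monthly_rates), 3)]
    return [sum(block) / 3 for block in blocks if len(block) == 3]


if __name__ == "__main__":
    raw = [1.0, None, None, 4.0, 5.0, 6.0]       # invented monthly growth rates
    pattern = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]     # invented seasonal signal
    cleaned = fill_gaps(remove_seasonal_pattern(raw, pattern))
    print(three_month_means(cleaned))
```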

Non-structural carbohydrate (NSC) analysis. Using samples cut by a tree climber 
from fully sunlit branches, three leaves and a branch sample of ~8-10 mm diameter 
were taken from 20 trees on the control plot and 21 trees on the TFE plot in 
November 2013, March 2014 and June 2014. The numbers and species of the trees 
selected for analysis are shown in Extended Data Table 1. The selected trees were 
all >10 cm dbh and represented the most common genera existing on both plots; 
samples were not taken from the external 10 × 10 m subplots to avoid any impacts 
of trenching. Samples of tree stem tissue from the same trees were taken with a 
5 mm increment borer at breast height once in November 2013; this sampling was 
not repeated to avoid excessive damage incurred by repeated boring. We followed 
the enzymatic method proposed by Sevanto et al. to analyse the NSC content. 
Here, NSC is defined as free, low-molecular-weight sugars (glucose, fructose and 
sucrose) and starch. Immediately after collection, samples were microwaved to 
stop enzymatic activity. After that, samples were oven-dried at 70°C for 24-48h 
and ground to fine powder. We prepared approximately 12 mg of plant material 
with 1.6 ml of distilled water for the analysis. We used amyloglucosidase from 
Aspergillus niger (Sigma-Aldrich) to digest total NSC to glucose, and invertase, 
glucose hexokinase kits (GHK) and phosphorus glucose (Sigma-Aldrich) to 
quantify the low molecular weight sugars. The concentration of free glucose was 
determined photometrically in a 96-well microplate spectrophotometer (BioTek, 
Epoch). NSC values are expressed as per cent of dry matter. For further method 
details, see Sevanto et al. 

Xylem vulnerability to cavitation (percentage loss of conductivity; PLC) and 
leaf water potential measurements. Samples were cut from fully sunlit branches 
following the same protocol as above for NSC during late dry season (November 
2013), and using the same sample trees as used for the NSC analysis (Extended 
Data Table 1), with the exception of branches of the genus Manilkara. To maintain a 
balance of number of genera and trees sampled, four additional trees were therefore 
sampled from the genus Inga in the TFE and control plots for both sets of analyses. 
One to three 1.0-1.5 m long branches per tree were cut and left to rehydrate over- 
night under a black plastic bag in a bucket of water. Maximum vessel length was 
determined for one branch out of a set of 5-6 branches for each species by injecting 
low pressure air at the branch base under water and progressively re-cutting the 
stem until bubbles emerged (maximum conduit length varied between 25 and 
50 cm across samples). Axial slits were made in the branch segments selected for 
PLC analysis to increase the efficacy of the air injection. These partially debarked 
segments were mounted on a 4.6 cm long air injection apparatus. Water, filtered to 
0.2 µm, flowed gravimetrically through the sample. After 10 min of equilibration 
at low pressure (10 kPa), the sample was pressurised for 20 min. The pressure was 
increased in steps of 0.3 to 0.5 MPa, with each step followed by 10 min relaxation 
and flow measured at a constant background 100 kPa air pressure, until a residual 
flow lower than 5% of the initial flow was found. Five to ten measurements of water 
flux were taken at the distal end. The interval for each conductivity measurement 
ranged from 2 to 10 min depending on stem length, conductivity and pressure head 
employed (normally 3 kPa). Segment lengths, cross-sectional diameters and leaf 
areas of the leaves subtended by the measured segment were determined. 

We employed a two-parameter Weibull function to model the changes in per 
cent loss of xylem hydraulic conductivity as a function of xylem pressure. The 
two parameters represented P50 (the xylem pressure at 50% loss of conductivity) 
and slope of the conductivity-pressure curve. We 
estimated P50 and slope for all trees using tree as a random factor in a mixed-model 
analysis (nlme library) in R (Version 3.0.2, R Core Team). We let P50 vary for each 
tree while keeping slope constant across trees to achieve convergence. We then 
employed these conditional estimates of P50 in a general linear model to test for 
the effects of plot, genus and dbh (Extended Data Table 2a). 

We then confirmed the above results by running a second mixed-effect model, 
in which we accounted for random tree-by-tree variation and variance driven by 
phylogeny, by nesting individual trees within genera as the random component 
of the model and incorporating plot and dbh as fixed effects (Extended Data 
Table 2b). We tested whether tree diameter affected P50 by comparing the performance 
of the full model with the same model without the effect of dbh on P50 
using a likelihood ratio test, and by conducting simultaneous hypothesis tests at the 
95% confidence level (library multcomp in R). For both analyses, distributional 
assumptions were tested by looking at plots of residuals for fixed and, where applicable, 
random effects. 
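
The likelihood ratio comparison of nested models can be illustrated with a short calculation. The sketch below is hedged in two ways: it assumes the models differ by a single parameter (one degree of freedom), which is an assumption made for the example, and it uses the rounded log-likelihoods reported in Extended Data Table 2 (−1378 with dbh, −1393 without), so the resulting statistic differs slightly from the tabulated likelihood ratio of 28.2.

```python
import math


def chi2_survival_1df(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Return the likelihood ratio statistic and its P value (1 d.f. assumed)."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_survival_1df(statistic)


if __name__ == "__main__":
    # Rounded log-likelihoods from Extended Data Table 2.
    statistic, p_value = likelihood_ratio_test_1df(-1378.0, -1393.0)
    print(f"LR statistic = {statistic:.1f}, P = {p_value:.2e}")
```

A small P value indicates that dropping dbh from the model significantly worsens the fit, which is the sense in which tree diameter is said to affect P50.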

Leaf water potential (Ψ_l) was measured during two campaigns (one during 
the dry and one during the wet season, that is, October 2013 and May 2014, 
respectively). Two to four leaves were measured for each branch of each tree, 
following the same sampling protocol above for NSC and hydraulic measurements. 
During each campaign, Ψ_l measurements were conducted at pre-dawn 
(before 07:00) and at midday (between 11:30 and 13:00). Midday Ψ_l measurements 
of both campaigns were used in conjunction with the P50 and slope values 
determined for each genus to estimate percent loss of hydraulic conductivity 
(PLC), assuming xylem water potential to be equal to measured leaf water 
potential. We acknowledge that this may overestimate PLC; however, in 
counterpoint, only one seasonal campaign could be conducted and so the minimum 
Ψ_l for that year was likely to have been underestimated because 
fuller sampling through the dry season was impractical. In addition, earlier 
(2003) studies in the same experiment reported minimum values of stem xylem 
water potentials of around −1.6 MPa, across control and TFE plots, suggesting 
that xylem values substantially more negative than those assumed here can be 
experienced. 
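
Estimating PLC from a measured water potential with a P50 and a slope can be sketched with one common reparameterization of the two-parameter Weibull curve. The exact functional form used in the analysis is not given in the text, so the form below is an assumption; pressures are expressed as positive tension magnitudes (so a Ψ_l of −1.6 MPa is entered as 1.6), the slope is the steepness of the curve at P50 in per cent per MPa, and the numerical values are hypothetical.

```python
import math


def plc_weibull(tension_mpa, p50_mpa, slope_at_p50):
    """Per cent loss of conductivity from a two-parameter Weibull curve.

    tension_mpa  : xylem tension as a positive magnitude (MPa)
    p50_mpa      : tension at 50% loss of conductivity (MPa, positive)
    slope_at_p50 : slope of the curve at P50 (% per MPa)
    """
    if tension_mpa <= 0.0:
        return 0.0
    # Shape exponent chosen so that the derivative at P50 equals the slope.
    shape = slope_at_p50 * p50_mpa / (50.0 * math.log(2.0))
    return 100.0 * (1.0 - 0.5 ** ((tension_mpa / p50_mpa) ** shape))


if __name__ == "__main__":
    # Hypothetical genus-level parameters and a midday leaf water potential.
    p50, slope = 2.0, 40.0
    midday_psi_l = -1.6  # MPa
    print(f"PLC = {plc_weibull(abs(midday_psi_l), p50, slope):.1f}%")
```

By construction the curve passes through 50% at P50 and rises monotonically with tension, so a more negative Ψ_l always gives a larger predicted PLC.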

Herbivory. All leaf material from 25 × 1 m² litter-traps on the control and TFE 
was collected 13 times from 2010 to 2014 at 3-6 month intervals, with one eight 
month interval in 2011. Each collection of leaf material represented two weeks of 
litter-fall in the forest. The 7,121 leaves collected from all 25 litter-traps per plot 
over the study period were scanned and the images were analysed according to 
Metcalfe et al. to calculate the percent leaf area lost to herbivore attack on the 
control and TFE. 

LAI measurements. LAI values from 2001 to 2007 are taken from Fig. 1f in 
Metcalfe et al. For 2009-2014 LAI was measured at the same 25 permanent 
points on a grid throughout the control and TFE plot. Measurements were made 
using hemispherical photos taken every 3-6 months from 2009 to 2014. Photos were 
taken before sunrise (~6:00) from a height of 1.5 m on the control plot and 2 m 
(above the TFE structure) on the TFE plot. The 25 photos per plot were analysed 
together using the CAN_EYE software (INRA, Avignon, France). Standard errors 
were calculated using three different estimates of LAI given by the CAN_EYE 
software. 

Extended Data Figure 1 | Leaf area index change. Leaf area index (LAI; ratio of leaf area to ground area) for the period of 2001-2014 on the control 
(black, solid) and TFE (grey, dashed) plots. Error bars show the s.e.m. associated with LAI calculation, which is derived from n = 25 photos per control 
and TFE plot (see Methods). 

Extended Data Figure 2 | Seasonal soil water potential. Average soil water potential (−MPa) in the control and TFE during dry season (July-December, control n = 34 months, TFE n = 40 months) and wet season (January-June, control n = 34 months, TFE n = 40 months), calculated from monthly average volumetric soil moisture content data, collected from 2008-2014, using sensors installed 0, 0.5, 1, 2.5 and 4 m below the surface and the necessary van Genuchten parameters previously calculated from soil hydraulics measurements at this site (see Methods). Error bars show s.e.m. 

Extended Data Figure 3 | Leaf herbivory comparison. Average percentage loss of leaf area from herbivore attack calculated from leaves collected in litter-traps on the control (n = 3,297) and TFE plot (n = 3,824) from 2010-2014. Error bars show s.e.m. and no significant differences were found at P < 0.05 using the Wilcoxon test. A separate analysis of herbivore attack on 13,694 top-canopy living leaves from branches of the 41 trees used for the P50 analysis supports these results, also showing no significant differences in percentage herbivory between the control and the TFE (data not shown). 

Extended Data Figure 4 | Diurnal patterns of Ψ_l. Diurnal Ψ_l measured every 2 h from 6:00 until 18:00 in dry season on trees accessible from the walk up tower. Each box shows the diurnal Ψ_l against diurnal air vapour pressure deficit (VPD) from one of seven trees accessible on the control (C), or one of four trees accessible on the TFE. Note that a majority of trees demonstrate an inversely correlated (negative) relationship with VPD. Combined separately for each plot, a significant negative linear relationship is observed between Ψ_l and VPD on the control (R² = 0.18, P = 0.002) and even more strongly on the TFE (R² = 0.33, P = 0.001). 
Panels: C1-C7 (control) and TFE1-TFE4 (TFE). 

© 2015 Macmillan Publishers Limited. All rights reserved 


Extended Data Table 1 | NSC and P50 sample trees 

Columns: Genus; Species; and, for each of the control and TFE plots, Number, dbh (cm) and P50 (MPa). 
Genera: Eschweilera, Licania, Manilkara, Pouteria, Protium, Swartzia, Inga. 
Species: grandiflora, coriacea, pedicellata, membranacea, octandra, bidentata, anomala, tenuifolium, paniculatum, racemosa, alba. 

The genus, species and number of trees >10 cm dbh sampled for NSC and P50 from the central 0.64 ha area of each of the control and TFE plots, and their P50 value. The dbh (in cm) of the sample 
trees is shown in brackets. Where possible, trees from the most common species within the most common genera were sampled; when this was not possible a second species within the same genus 
was sampled. The genera shown here represent seven of the most abundant genera found across both plots. The samples from the genus Inga were only employed for leaf water potential and P50 
measurements to replace Manilkara samples from which P50 data were unobtainable. 

Extended Data Table 2 | Analysis of the effect of tree dbh on xylem P50 

            Degrees of freedom   Sum Squares   Mean Square   F value   Probability 
Tree dbh    1                    6.139         6.1390        9.1974    0.0051 ** 
Plot        1                    0.095         0.0950        0.1423    0.7087 ns 
Genus       5                    32.917        6.5834        9.8632    1.4e-5 *** 
Residuals   29                   19.357        0.6675 

Model covariates   Degrees of freedom   AIC    Log-likelihood   Likelihood ratio   P-value 
Without tree dbh   6                    2793   -1393            28.2               <0.0001 
With tree dbh      4                    2769   -1378 

Top, analysis of variance table from the general linear model testing for the effects of plot, genus and dbh on xylem P50. Bottom, difference in maximum likelihood for the mixed-model with and without 
tree size as a predictor of xylem P50 (see Methods). 
The table shows the main parameters and the significance level of the χ² test of the likelihood ratio test. ns, not significant; **, highly significant; ***, very highly significant. 

Extended Data Table 3 | Individual tree mortality by genus 

Control plot                                TFE plot 
Genus            No.  Year dead             Genus          No.  Year dead 
Aspidosperma     1    2007                  Abuta          1    2014 
Buchenavia       1    2003                  Bauhinia       1    2009 
Couratari        3    2013,2013,2014        Brosimum       1    2005 
Duguetia         2    2004,2005             Chimarrhis     1    2014 
Eschweilera      3    2009,2013,2013        Couepia        1    2005 
Eugenia          1    2001                  Dendrobangia   2    2005,2005 
Franchetella     1    2014                  Derris         1    2005 
Guatteria        2    2005,2005             Doliocarpus    1    2009 
Helicostylis     1    2007                  Erisma         1    2006 
Inga             2    2009,2013             Eschweilera    3    2006,2007,2007 
Iryanthera       1    2013                  Forsteronia    1    2009 
Licania          4    2002,2008,2013,2013   Goupia         4    2002,2004,2013,2014 
Micropholis      4    2002                  Guatteria      2    2008,2009 
Minquartia       2    2003,2004             Hirtella       1    2009 
NI               4    2001,2002,2013,2013   Inga           4    2005,2008,2008,2014 
Parkia           1    2014                  Iryanthera     1    2014 
Pouteria         4    2003,2007,2009,2010   Lecythis       5    2004,2009,2009,2013,2013 
Protium          4    2005,2009,2014,2014   Licania        2    2008,2009 
Rinoria          4    2004,2009,2013,2014   Machaerium     1    2003 
Sclerolobium     1    2005                  Manilkara      4    2004,2004,2008,2014 
Stachyarrhena    1    2014                  Marmaroxylon   1    2001 
Stryphnodendron  1    2006                  Mezilaurus     1    2008 
Swartzia         4    2002,2003,2007,2014   Micropholis    4    2003,2005,2007,2009 
Tapura           1    2014                  Minquartia     1    2009 
Tetragastris     2    2006,2013             Naucleopsis    1    2002,2007,2014 
Vantanea         1    2003                  Newtonia       1    2009 
Virola           1    2006                  NI             2    2002,2013 
Vouacapoua       2    2013,2014             Ocotea         4    2002,2008,2009,2013 
                                            Oenocarpus     1    2008 
                                            Ormosia        2    2001,2009 
                                            Ouratea        4    2006,2007,2009,2014 
                                            Pouteria       6    2005,2005,2007,2009,2013,2014 
                                            Pradosia       1    2005 
                                            Protium        5    2005,2007,2007,2009,2009 
                                            Pseudolmedia   2    2002,2006 
                                            Quararibea     1    2009 
                                            Sclerolobium   1    2014 
                                            Stachyarrhena  3    2007,2013,2014 
                                            Swartzia       1    2004 
                                            Symphonia      1    2014 
                                            Tetragastris   1    2006,2013,2014 
                                            Xylopia        5    2002,2004,2005,2014,2014 

The number (No.) of dead trees per genus and the year of death (Year dead) for trees on the control and the TFE, excluding the outer subplots (see Methods). Trees not identified to genus are shown as NI. 

Reversal of phenotypes in MECP2 duplication mice 
using genetic rescue or antisense oligonucleotides 


Yehezkel Sztainberg, Hong-mei Chen, John W. Swann, Shuang Hao, Bin Tang, Zhenyu Wu, Jianrong Tang, 
Ying-Wooi Wan, Zhandong Liu, Frank Rigo & Huda Y. Zoghbi 


Copy number variations have been frequently associated with 
developmental delay, intellectual disability and autism spectrum 
disorders. MECP2 duplication syndrome is one of the most 
common genomic rearrangements in males and is characterized 
by autism, intellectual disability, motor dysfunction, anxiety, 
epilepsy, recurrent respiratory tract infections and early death. 
The broad range of deficits caused by methyl-CpG-binding protein 
2 (MeCP2) overexpression poses a daunting challenge to traditional 
biochemical-pathway-based therapeutic approaches. Accordingly, 
we sought strategies that directly target MeCP2 and are amenable 
to translation into clinical therapy. The first question that we 
addressed was whether the neurological dysfunction is reversible 
after symptoms set in. Reversal of phenotypes in adult symptomatic 
mice has been demonstrated in some models of monogenic loss-of- 
function neurological disorders, including loss of MeCP2 in Rett 
syndrome, indicating that, at least in some cases, the neuroanatomy 
may remain sufficiently intact so that correction of the molecular 
dysfunction underlying these disorders can restore healthy 
physiology. Given the absence of neurodegeneration in MECP2 
duplication syndrome, we propose that restoration of normal 
MeCP2 levels in MECP2 duplication adult mice would rescue 
their phenotype. By generating and characterizing a conditional 
Mecp2-overexpressing mouse model, here we show that correction 
of MeCP2 levels largely reverses the behavioural, molecular and 
electrophysiological deficits. We also reduced MeCP2 using an 
antisense oligonucleotide strategy, which has greater translational 
potential. Antisense oligonucleotides are small, modified nucleic 
acids that can selectively hybridize with messenger RNA transcribed 
from a target gene and silence it, and have been successfully 
used to correct deficits in different mouse models. We find that 
antisense oligonucleotide treatment induces a broad phenotypic 
rescue in adult symptomatic transgenic MECP2 duplication mice 
(MECP2-TG), and corrects MECP2 levels in lymphoblastoid 
cells from MECP2 duplication patients in a dose-dependent manner. 

To determine whether MECP2 duplication syndrome is reversible, 
we generated a conditional MECP2 overexpression mouse model 
that carries two functional alleles with species-matched endogenous 
control elements: a human wild-type MECP2 allele, and a conditional 
mouse Mecp2 allele (Mecp2') that can be deleted using tamoxifen- 
inducible Cre recombination (Fig. 1a). Our breeding strategy resulted 
in FVB/N × C57Bl/6 F1 hybrid mice belonging to the following three 
genotypes: Flox, Flox;TG and Flox;TG;Cre. The loxP sequences did 
not alter MeCP2 expression or phenotype as Flox and Flox;TG mice 
were indistinguishable from wild-type and transgenic (TG) mice, 
respectively, in both molecular and behavioural assays (Extended Data 
Figs 1 and 2). To ascertain the efficiency of Cre-mediated recombi- 
nation, we injected Flox;TG;Cre mice intraperitoneally with either 


tamoxifen (TMX) or vehicle over the course of 4 weeks (Fig. 1b), and 
euthanized four cohorts of mice at different time points after initiation 
of treatment. MeCP2 protein levels were significantly downregulated at 
2.5 weeks, and the levels of MeCP2 remained low thereafter (Fig. 1c, d). 

Figure 1 | Inducible Cre-lox recombination normalizes MeCP2 levels in 
adult MECP2 duplication mice. a, Breeding strategy to generate conditional 
MECP2 overexpression mice. To mediate recombination, we used a Cre 
recombinase driven by a ubiquitin C promoter and fused to a modified 
human oestrogen receptor (UBC-cre/ERT2). b, Tamoxifen (TMX) treatment 
protocol and the time points for western blot (WB), immunofluorescence 
(IF) and RT-qPCR. c, Western blot from cortical samples at 6 weeks (for gel 
source data, see Supplementary Fig. 1). d, Kinetics of MeCP2 levels (n = 6; 
for gel source data, see Supplementary Fig. 1). e, RT-qPCR from cortical 
samples with specific primers for human or mouse Mecp2, and for each 
of the two alternatively spliced isoforms (n = 6). f, Immunostaining for 
MeCP2 in hippocampal slices. NS, not significant. Data are mean ± s.e.m. 
**P < 0.01; ***P < 0.001 (two-tailed t-test). 

Figure 2 | Genetic normalization of MeCP2 levels reverses deficits in 
adult MECP2 duplication mice. a, Reversal of hypoactivity and anxiety-like 
behaviours in the open field. b, Representative tracking plots of the 
open field test. c, Reversal of anxiety-like behaviour in the elevated plus 
maze test. d, Reversal of abnormal motor behaviour on the rotarod test 
(asterisks indicate significant difference between Flox;TG;Cre-TMX and 
Flox;TG;Cre-vehicle groups). D1T1, day 1, trial 1. e, Reversal of impaired 
social behaviour in the 3-chamber test. f, No preference for the left (L) 
or right (R) chambers in the habituation phase of the test. n = 19 for 
Flox-TMX and Flox;TG-TMX groups; n = 14 for Flox;TG;Cre-vehicle 
and Flox;TG;Cre-TMX groups. g, Transcriptional heat map for the 
hippocampus (n = 3). h, TMX treatment normalized LTP in Flox;TG;Cre 
mice (n = 6-7 mice, 15-20 slices). Top, representative electrophysiological 
traces at baseline and after high-frequency stimulation (HFS). 
i, Quantification of the last 10 min of the LTP recording (n = 6-7 mice, 
15-20 slices). Data are mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001 
(two-tailed t-test (a, c, e, f, i), and repeated-measures two-way analysis of 
variance (ANOVA) followed by Tukey honest significant difference (HSD) 
post hoc correction for multiple comparisons (d, h)). 


Moreover, quantitative reverse transcription PCR (RT-qPCR) showed 
that Cre-mediated recombination efficiently downregulated mRNA 
levels of both alternatively spliced isoforms (Mecp2-e1 and Mecp2-e2) 
of the floxed mouse Mecp2, but not the human transgenic MECP2 allele 
(Fig. 1e). Finally, we confirmed the normalization of MeCP2 levels by 
immunofluorescence staining of hippocampal slices (Fig. 1f). 

Next, we injected a new cohort of 8-9-week-old mice with TMX or 
vehicle for behavioural characterization. Flox;TG;Cre mice injected 
with TMX (Flox;TG;Cre-TMX) were indistinguishable from Flox con- 
trol mice in the different assays, showing a resolution of the phenotypes 
that resemble MECP2 duplication syndrome, such as hypoactivity, 
anxiety-like behaviour, motor abnormalities and social behaviour 
deficits (Fig. 2a-f). 

Changes in MeCP2 abundance affect the mRNA levels of thousands 
of genes in the brain. Therefore, we proposed that normalizing 
MeCP2 levels would also normalize gene expression patterns. 
First, we analysed the expression of selected MeCP2-sensitive genes 
in the hypothalamus and the cerebellum by RT-qPCR, in adult 
mice. The mRNA levels of these genes in the Flox;TG;Cre-TMX group 
were indistinguishable from the Flox control group (Extended Data 
Fig. 3a, b). We then performed whole transcriptome sequencing (RNA- 
seq) analysis to evaluate expression patterns in the hippocampus. The 
analysis showed that the mRNA expression profile of the Flox;TG; 
Cre-TMX group clustered together with the Flox control group 
(Fig. 2g and Supplementary Table 1). Reducing MeCP2 to normal levels 
in symptomatic mice thus seems to rescue the behavioural phenotype 
by reversing pathogenic molecular changes in the brain. 

MeCP2 levels also influence synaptic plasticity, as indicated by 
abnormalities in hippocampal long-term potentiation (LTP) in 
MeCP2-null** and MECP2-TG mice’. We therefore assessed LTP 
at the Schaffer collateral synapses of the CA1 area in the hippocam- 
pus, and found no difference in the input-output relationship or 
paired-pulse facilitation between the groups, indicating that MeCP2 
overexpression does not affect basal synaptic transmission or short- 
term plasticity (Extended Data Fig. 4a—d). However, normalization 
of MeCP2 levels in Flox;TG;Cre-TMX mice completely rescued the 
abnormal, enhanced LTP (Fig. 2h, i). Defects in long-term hippocam- 
pal synaptic plasticity induced by MeCP2 overexpression are thus 
reversible in adult mice. 

To determine whether we could normalize MeCP2 levels using a 
strategy that is more readily translatable into a medical therapy, we 
took advantage of our MECP2-TG mice containing one copy of human 
MECP2 (in addition to the endogenous mouse gene) to screen for a 
treatment using human-specific antisense oligonucleotides (ASOs). We 
tested ASOs designed to bind several regions of the human MECP2 pre- 
cursor mRNA (pre-mRNA) so as to reduce the levels of both alterna- 
tively spliced MECP2 isoforms, MECP2-e1 and MECP2-e2 (Extended 
Data Fig. 5a). After screening MECP2 ASOs for their ability to reduce 
MECP2 levels in cultured human cells, and for toxicity in wild-type 
mice (data not shown), we screened five selected MECP2 ASOs by 
injecting them stereotaxically into the brain of MECP2-TG mice 
(Extended Data Fig. 5 and Extended Data Table 1), and used the most 
effective one, ASO-5, for further studies. 

To determine the duration of treatment efficacy, we gradually infused 
ASO into the brains of 7-8-week-old mice using micro-osmotic pumps 
designed to deliver the molecule at a constant rate over a 4-week period 
(Fig. 3a and Extended Data Fig. 6). At the end of treatment, immu- 
nofluorescence staining showed that the ASO was widely distributed 
throughout the brain (Fig. 3b) and that it effectively knocked down 
MeCP2 to close to wild-type levels (Fig. 3c). We next analysed MeCP2 
expression in the cortex at different time points after initiation of treat- 
ment by western blot (as described in Fig. 3a). MeCP2 was significantly 
downregulated 4 weeks after the initiation of the treatment (Fig. 3d, e), 
and remained so for an additional 4 weeks after stopping the infusion 
(Fig. 3e). We further confirmed the specificity of the ASO for human 
MECP2 by RT-PCR (Fig. 3f). 

We then treated a new cohort of animals for behavioural character- 
ization. At 6-7 weeks after the initiation of the treatment, rescue was 
evident only in the rotarod test (Fig. 3i and Extended Data Fig. 7), but 
by 10-11 weeks, the hypoactivity, anxiety-like behaviour and social 
behaviour of MECP2-TG mice were also reversed (Fig. 3g, h, j). Ten 
weeks after treatment cessation, when MeCP2 levels had increased 
to pre-treatment levels, the symptoms reappeared (data not shown). 
To gain insight into the basis of the delayed behavioural rescue, we 
performed RNA-seq analysis in the hippocampus at two different 
time points after treatment initiation (Fig. 4a and Supplementary 
Table 2). We found that 4 weeks after initiation of treatment, there 
was a trend towards normalization of the expression of some mRNAs, 
but the transgenic-ASO group did not cluster together with the 
wild-type group. By 8 weeks, however, the transgenic-ASO group 
clustered together with the wild-type group, suggesting that rever- 
sal of pathogenic molecular changes in the brain as a consequence 
of MeCP2 normalization correlates strongly with resolution of the 
behavioural phenotype. To test for possible off-target effects, we 
compared expression profiles at both time points to detect genes whose 
expression was significantly affected by the ASO treatment (transgenic 
versus transgenic-ASO), but was not different between wild-type and 
transgenic mice. We found only 10 overlapping genes that meet this 
criterion (see Supplementary Table 2), suggesting minimal off-target 
effects. 

Figure 3 | Gradual infusion of ASO normalizes MeCP2 levels and 
reverses abnormal behaviour. a, Timeline of gradual ASO treatment 
and western blot, immunofluorescence, RNA-seq and behavioural tests. 
3-CH, 3 chamber; EPM, elevated plus maze; OFA, open field activity. 
b, Immunofluorescence staining against ASOs. Bs, brainstem; cb, 
cerebellum; ctx, cortex; hipp, hippocampus; ob, olfactory bulb; st, striatum. 
c, Immunofluorescence staining for MeCP2. d, Representative western blot 
of MeCP2 in the cortex 4 weeks after initiation of treatment (for gel source 
data, see Supplementary Fig. 2). e, Kinetics of MeCP2 levels (n = 4-5; 
for gel source data, see Supplementary Fig. 2). f, RT-qPCR from cortical 
samples with specific primers for mouse or human MECP2 and for each 
of the two alternatively spliced MECP2 isoforms (n = 3). g, Reversal of 
hypoactivity and anxiety-like behaviour in the open field at week 10. 
h, Reversal of anxiety-like behaviour in the elevated plus maze test at 
week 10. i, Reversal of abnormal motor behaviour on the rotarod test at 
week 7. Data were averaged per day over four trials. j, Reversal of abnormal 
social behaviour in the 3-chamber test at week 11. n = 19 for wild-type 
group; n = 16 for transgenic (TG) group; n = 15 for TG-ASO group. Data 
are mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001 (two-tailed t-test 
(e, f), and one-way ANOVA followed by Fisher’s least significant difference 
(LSD) post hoc test (g-j)). 

Seizures occur in MECP2 duplication syndrome mice as they age’”, 
so we tested the ability of ASO treatment to reverse abnormal elec- 
trographic discharges in 25-35-week-old MECP2-TG1 mice (pure 
FVB/N background). Vehicle-treated MECP2-TG1 mice manifested 
electrographic seizure spikes in electroencephalography (EEG) record- 
ings from the cortex (Fig. 4b and Extended Data Fig. 8), and strong 
electrographic seizure events were typically accompanied by behav- 
ioural seizures (Supplementary Video 1). ASO treatment abolished 
these abnormal EEG discharges and eliminated behavioural seizures 
and electrographic seizure spikes in this group (Fig. 4b and Extended 
Data Fig. 8). 

Figure 4 | ASO treatment corrects abnormal gene expression and 
EEG. a, Transcriptional heat maps of the hippocampus (n = 3 mice per 
group). b, Representative EEG traces (n = 3 for wild-type group; n = 4 for 
transgenic group; n = 6 for transgenic-ASO group). LF, left frontal cortex; 
LP, left parietal cortex; RP, right parietal cortex. c, ASO-corrected MECP2 
mRNA levels in lymphoblastoid cells from MECP2 duplication patients 
(n = 5). Asterisks denote statistical difference to the control group. Data 
are mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001 (one-way ANOVA 
followed by Tukey HSD post hoc correction for multiple comparisons). 


Lastly, we determined that MECP2-ASO treatment corrected 
MECP2 mRNA levels in a dose-dependent manner in lymphoblastoid 
cells from MECP2 duplication patients (Fig. 4c and Extended Data 
Table 2). The amount of ASO can thus be titrated and optimized so as 
to target only some MECP2 RNA and allow translation of physiological 
levels of the protein. 

As the main ‘reader’ of methylated cytosines**, MeCP2 has a fun- 
damental role in epigenetics, controlling chromatin states and the 
expression of thousands of genes*?>8, Accordingly, MeCP2 expression 
must be maintained within a fairly narrow range to assure proper gene 
expression and neuronal function?”. Here we demonstrated that resto- 
ration of MeCP2 to its normal level can largely reverse the phenotype 
of adult symptomatic MECP2 duplication mice. 

It is worth noting that reversal of MECP2 duplication-like features 
was evident 6-7 weeks after genetic rescue, but took 10-11 weeks 
after initiation of ASO treatment. This 4-week difference is probably 
due to the more gradual reduction in MeCP2 levels brought about by 
ASO treatment. The RNA-seq results in the ASO experiment corre- 
lated well with the resolution of the disease phenotype: abnormal gene 
expression was still prominent 4 weeks after the initiation of treat- 
ment (Fig. 4a), whereas robust correction of gene expression 8 weeks 
after treatment initiation (and 4 weeks after MeCP2 levels normalized 
(Fig. 3e)) was accompanied by full behavioural rescue. In mice, there- 
fore, MeCP2 levels must be normal for 1 month before there is resolu- 
tion of the duplication phenotypes. 

The finding that ASO treatment rescues the MECP2 duplication-like 
phenotypes to a similar extent as the genetic rescue provides a proof- 
of-concept about the value of this approach. To move this closer to 
translation, further studies will have to test different ASO dosages and 
establish the safety margin of MeCP2 levels, using a mouse model that 
exclusively expresses two human MECP2 alleles. Additionally, we will 
screen thousands of MECP2 ASOs for off-target effects. 

Overall, our results show that delivering ASOs to the central nerv- 
ous system is a promising therapeutic approach for treating MECP2 
duplication syndrome, and has potential for other disorders caused 
by duplication of genetic material by targeting genes in the respec- 
tive critical regions, such as peripheral myelin protein 22 (PMP22) 
in Charcot-Marie-Tooth disease7’, retinoic acid induced receptor 1 
(RAI1) in Potocki-Lupski syndrome”’, and dual specificity tyrosine- 
phosphorylation-regulated kinase 1A (DYRK1A) in Down syndrome””. 
METHODS 

ASO synthesis. Isis Pharmaceuticals synthesized all ASOs as previously 
described'*. All ASOs consist of 20 chemically modified nucleotides (MOE gap- 
mer). The central gap of 10 deoxynucleotides is flanked on its 5’ and 3’ sides by 
five 2’-O-(2-methoxyethyl) (MOE)-modified nucleotides. The backbone modi- 
fications from 5’ to 3’ are: 1-PS, 4-PO, 10-PS, 2-PO and 2-PS. Phosphorothioate 
(PS) modifications were replaced with native phosphodiester (PO) in the MOE 
wings to reduce the overall PS content of the ASO, since a fully modified PS ASO 
is not necessary for robust CNS activity'®. The sequence of each ASO is listed in 
Extended Data Table 1. 

Mice. The first MeCP2-overexpressing mice were MECP2-TG mice on a FVB/N 
pure background’. These mice show normal locomotion in the open field and 
an increase in vertical activity, which was interpreted as less anxiety, but there 
was no difference in anxiety-like behaviour in the light-dark test. Mice on a pure 
FVB/N background, however, develop premature retinal degeneration, which can 
confound the interpretation of some behavioural tests*!. To overcome issues related 
to a pure inbred strain, our laboratory characterized F1 hybrid MECP2-TG mice 
(FVB/N × C57Bl/6 or FVB/N × 129S6/SvEv), and showed that these mice display 
several phenotypes as early as 7 weeks of age”, including increased anxiety and 
a trend towards hypoactivity. Therefore, for both the genetic rescue and the ASO 
treatment experiments, we decided to continue using F1 hybrid MECP2-TG mice. 

For experiments related to the validation of the Flox;TG mouse model 
(Extended Data Figs 1 and 2), we generated F1 hybrid animals by mating male 
MECP2-TG1 mice on a pure FVB/N background’ to female mice heterozygous 
for the Mecp2lox allele (Flox) (B6;129S4-Mecp2tm1Jae/Mmucd obtained from 
MMRRC, and backcrossed to C57Bl/6J for more than 10 generations; see also 
scheme in Extended Data Fig. 1a). For experiments related to conditional rescue 
of MECP2-TG mice (Figs 1 and 2), we first mated Flox C57Bl/6 females with 
C57Bl/6 Cre-ER males (B6.Cg-Tg(UBC-cre/ERT2)1Ejb/J obtained from Jackson 
Laboratories). The F1 Flox;Cre females were then mated to FVB/N MECP2-TG1 
males to generate the F1 hybrid, triple-transgenic Flox;TG;Cre male mice and their 
control littermates (Flox and Flox;TG) (see scheme in Fig. 1a). For studies related 
to MECP2-ASOs, we generated F1 hybrid animals by mating FVB/N MECP2-TG1 
females and wild-type 129S6/SvEv male mice (Taconic Farms). For the EEG exper- 
iment (Fig. 4b), we used MECP2-TG1 males on a pure FVB/N background. 

We routinely used mouse littermates as controls for our experiments. 
Throughout the experiments, mice were maintained in a temperature-controlled, 
AALAS-certified level 3 facility on a 12 h light-dark cycle. Food and water were 
given ad libitum. All procedures to maintain and use these mice were approved by 
the Institutional Animal Care and Use Committee for Baylor College of Medicine. 
Animals were randomly selected using Excel software to generate a table of random 
numbers for all genetic and treatment studies. For all experiments, the individuals 
performing the behavioural and electrophysiological studies were blinded to the 
genotype or treatment. 

Preparation of brain lysates and western blot. Brains were dissected and homog- 
enized in cold lysis buffer (20mM Tris-HCl, pH 8.0, 180 mM NaCl, 0.5% NP-40, 
1mM EDTA and Complete Protease Inhibitor, Roche). Lysates were rotated 
for 20 min at 4°C. After centrifugation at 4°C, the supernatant was mixed with 
NuPAGE sample buffer, heated for 5 min at 95°C, and run on a NuPAGE 4-12% 
Bis-Tris gradient gel with MES SDS running buffer (NuPAGE). Proteins were 
transferred to a nitrocellulose membrane using NuPAGE Transfer Buffer for 1.5h 
at 4°C. The membrane was blocked for 1h with 5% milk in TBS with 2% Tween- 
20 (TBST) followed by overnight incubation with primary antibody at 4°C. After 
four 10-min washes with TBST, the membrane was incubated with secondary 
antibody for 1-2h at room temperature. Horseradish peroxidase (HRP) was 
detected using SuperSignal West Dura kit, Thermo Scientific. Western blot images 
were acquired by ImageQuant LAS 4000 (GE Healthcare) and quantified by an 
ImageJ software package. Primary antibodies: rabbit antiserum raised against the 
amino terminus of MeCP2 (1:5,000; Zoghbi laboratory), mouse anti-GAPDH 6C5 
(1:20,000; Advanced Immunochemicals, 2-RGM2). Secondary antibodies: goat 
anti-rabbit HRP (1:20,000; Bio-Rad), donkey anti-mouse HRP (1:20,000; Jackson 
ImmunoResearch Labs, 715-035-150). 

Gene expression analysis by RT-qPCR. The subset of mice for RT-qPCR was 
selected randomly, using Excel software to generate a table of random numbers. 
No significant differences on behavioural measurements were found between the 
selected mice and the rest (per genotype group). Total RNA from mouse brain 
tissue was extracted using miRNeasy minikit (Qiagen), and 1 μg of total RNA 
was used to synthesize cDNA by Quantitect reverse transcription kit (Qiagen). 
For human lymphoblasts, 2 μg of total RNA was used to synthesize cDNA. 
RT-qPCR was performed in a CFX96 Real-Time System (Bio-Rad) using PerfeCTa 
SYBR Green Fast Mix (Quanta Biosciences). Sense and antisense primers were 
selected to be located on different exons, and the RNA was treated with DNase, 
to avoid false-positive results caused by DNA contamination. The specificity of 
the amplification products was verified by melting curve analysis. All RT-qPCR 
reactions were conducted in technical triplicates and the results were averaged 
for each sample, normalized to Hprt levels, and analysed using the comparative 
ΔΔCt method. The following primers were used in the RT-qPCR reactions: 
MECP2 (common to human and mouse): 5’-TATTTGATCAATCCCCAGGG-3’ (sense), 
5’-CTCCCTCTCCCAGTTACCGT-3’ (antisense); 
MECP2 (human-specific): 5’-GATGTGTATTTGATCAATCCC-3’ (sense), 
5’-TTAGGGTCCAGGGATGTGTC-3’ (antisense); 
Mecp2-e1 (mouse-specific): 5’-AGGAGAGACTGGAGGAAAAGTC-3’ (sense), 
5’-CTTAAACTTCAGTGGCTTGTCTCTG-3’ (antisense); 
Mecp2-e2 (mouse-specific): 5’-CTCACCAGTTCCTGCTTTGATGT-3’ (sense), 
5’-CTTAAACTTCAGTGGCTTGTCTCTG-3’ (antisense); 
MECP2-e1 (human-specific): 5’-AGGAGAGACTGGAAGAAAAGTC-3’ (sense), 
5’-CTTGAGGGGTTTGTCCTTGA-3’ (antisense); 
MECP2-e2 (human-specific): 5’-CTCACCAGTTCCTGCTTTGATGT-3’ (sense), 
5’-CTTGAGGGGTTTGTCCTTGA-3’ (antisense); 
Hprt (mouse-specific): 5’-CGGGGGACATAAAAGTTATTG-3’ (sense), 
5’-TGCATTGTTTTACCAGTGTCAA-3’ (antisense); 
HPRT (human-specific): 5’-GACCAGTCAACAGGGGACAT-3’ (sense), 
5’-CCTGACCAAGGAAAGCAAAG-3’ (antisense); 
Sst: 5’-CCCAGACTCCGTCAGTTTCT-3’ (sense), 
5’-GAAGTTCTTGCAGCCAGCTT-3’ (antisense); 
Crf: 5’-TACCAAGGGAGGAGAAGAGA-3’ (sense), 
5’-GATCAGAACCGGCTGAGGT-3’ (antisense); 
Npbwr1: 5’-TCTCTTACTTCATCACCAGCC-3’ (sense), 
5’-GCATAGAGGAAAGGGTTGAG-3’ (antisense); 
Gamt: 5’-GGATTATTGAGTGCAATGATGG-3’ (sense), 
5’-TCAAGGGAACAACCTTATGTG-3’ (antisense); 
Agrp: 5’-TCAAGAAGACAACTGCAGAC-3’ (sense), 
5’-TCTGTGGATCTAGCACCTC-3’ (antisense); 
Rcor2: 5’-ACCCGAAGTCGAACTAGTG-3’ (sense), 
5’-CTAGTTCATCACTGTCTTCTTTG-3’ (antisense); 
Prl2c2: 5’-CATGAGCACCATGCTTCAG-3’ (sense), 
5’-GCGAGCATCTTCATTGTCAG-3’ (antisense). 
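
The comparative ΔΔCt calculation mentioned above can be illustrated with a short sketch. This is not the authors' analysis script: the function name and the Ct values are hypothetical, and the code assumes roughly 100% amplification efficiency, so that one cycle corresponds to a twofold change in template.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct,
                        control_target_ct, control_reference_ct):
    """Fold change of a target gene by the comparative ddCt method.

    Each argument is a list of technical-replicate Ct values (e.g. triplicates).
    Assumption: ~100% amplification efficiency (fold change = 2 ** -ddCt).
    """
    # Average the technical replicates, then normalize target to reference (e.g. Hprt).
    delta_ct_sample = mean(target_ct) - mean(reference_ct)
    delta_ct_control = mean(control_target_ct) - mean(control_reference_ct)
    # Compare the sample of interest with the control sample.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(
    target_ct=[24.0, 24.1, 23.9],
    reference_ct=[20.0, 20.0, 20.0],
    control_target_ct=[25.0, 25.1, 24.9],
    control_reference_ct=[20.0, 20.0, 20.0],
)
```

In this made-up example the target crosses threshold one cycle earlier than in the control after normalization, which corresponds to an approximately twofold higher relative expression.
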
Immunofluorescence. Animals were anaesthetized with a mix of ketamine 
37.6 mg ml⁻¹, xylazine 1.92 mg ml⁻¹ and acepromazine 0.38 mg ml⁻¹, and 
transcardially perfused with 20 ml PBS followed by 100 ml of cold PBS-buffered 4% 
paraformaldehyde (PFA). The brains were removed and post-fixed overnight in 4% 
PFA. Next, brains were cryoprotected in 4% PFA with 30% sucrose at 4°C for two 
additional days and embedded in Optimum Cutting Temperature (O.C.T., 
Tissue-Tek). Free-floating 40-μm brain sections were cut using a Leica CM3050 cryostat 
and collected in PBS. The sections were blocked for 1h in 2% normal goat serum, 
0.3% Triton X-100 in PBS at room temperature. Sections were then incubated 
overnight at 4°C with either rabbit anti-MeCP2 antibody (1:1,000; Cell Signaling) 
or rabbit anti-ASO antibody (1:10,000; Isis Pharmaceuticals). The sections were 
washed three times for 10 min with PBS, and incubated for 3h at room tempera- 
ture with goat anti-rabbit antibody (1:500; Alexa Fluor 488, Invitrogen, A-11034). 
Sections were washed again three times for 10 min with PBS and mounted onto 
glass slides with Vectashield mounting medium with DAPI (Vector Laboratories). 
Tamoxifen treatment. Tamoxifen (Sigma-Aldrich, T5648) was dissolved to 
20 mg ml⁻¹ in peanut oil, aliquotted and frozen at −20°C until use. Peanut oil was 
also used as a vehicle. Tamoxifen or vehicle was injected intraperitoneally at a dose 
of 100 mg kg⁻¹, on three alternate days a week for 4 weeks (as described in Fig. 1b). 
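
As a worked example of the dosing arithmetic, the sketch below converts the stated dose into an injection volume per mouse. It reads the stock concentration as 20 mg per ml and the dose as 100 mg per kg; the function name and the 25 g body weight are hypothetical and serve only as an illustration.

```python
def injection_volume_ml(body_weight_g, dose_mg_per_kg=100.0, stock_mg_per_ml=20.0):
    """Volume of stock solution (ml) that delivers the dose to one mouse."""
    # Convert body weight to kg and compute the absolute dose in mg.
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0
    # Divide by the stock concentration to obtain the volume to inject.
    return dose_mg / stock_mg_per_ml


# Hypothetical 25 g mouse: 2.5 mg of tamoxifen per injection.
volume = injection_volume_ml(25.0)

# Three injections a week for 4 weeks gives 12 injections in total.
total_injections = 3 * 4
```
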
Behavioural assays. All data acquisition and analyses were carried out by an 
individual blinded to the genotype and treatment. All behavioural studies were 
performed during the light period. Mice were habituated to the test room for 1h 
before each test. At least one day was given between assays for the mice to recover. 
All the tests were performed as previously described”? with few modifications. 
Open field test. After habituation in the test room (150 lx, 60 dB white noise), mice 
were placed in the centre of an open arena (40 x 40 x 30cm), and their behaviour 
was tracked by laser photobeam breaks for 30 min. General locomotor activity was 
automatically analysed using AccuScan Fusion software (Omnitech) by counting 
the number of times mice break the laser beams (activity counts). In addition, 
rearing activity, the time spent in the centre of the arena, entries to the centre and 
distance travelled were analysed. In this study, we found that MECP2-TG mice 
are hypoactive in the open field test. In contrast, in ref. 20, MECP2-TG mice show 
a non-significant trend towards hypoactivity. This difference might be the result 
of our study assessing locomotor activity by measuring activity counts, and in 
ref. 20 by measuring the distance travelled, which is calculated by the software 
from the activity counts. 
Elevated plus maze test. After habituation in the test room (700 lx, 60 dB white 
noise), mice were placed in the centre part of the maze facing one of the two open 
arms. Mouse behaviour was video-tracked for 10 min, and the time mice spent in 
the open arms and the entries to the open arms, as well as the distance travelled in 
the open arms, were recorded and analysed using ANY-maze system (Stoelting). 
Accelerating rotarod test. After habituation in the test room (700 lx, 60 dB white 
noise), motor coordination was measured using an accelerating rotarod apparatus 
(Ugo Basile). Mice were tested for two consecutive days, four trials each, with an 
interval of 60 min between trials to rest. Each trial lasted for a maximum of 10 min, 
and the rod accelerated from 4 to 40 r.p.m. in the first 5 min. The time that it took 
for each mouse to fall from the rod (latency to fall) was recorded. 
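
The rotarod protocol above can be sketched in code as follows. The linear ramp and the constant top speed after 5 min are assumptions (the text gives only the start speed, end speed and ramp duration), and both function names are hypothetical.

```python
from statistics import mean


def rod_speed_rpm(t_s, start_rpm=4.0, end_rpm=40.0, ramp_s=300.0):
    """Rod speed at time t_s (seconds), assuming a linear ramp then constant speed."""
    if t_s < 0:
        raise ValueError("time must be non-negative")
    # Fraction of the ramp completed, capped at 1 once the top speed is reached.
    fraction = min(t_s / ramp_s, 1.0)
    return start_rpm + (end_rpm - start_rpm) * fraction


def daily_latency(latencies_s, max_trial_s=600.0):
    """Mean latency to fall over one day's trials, capping each at the trial maximum."""
    return mean(min(x, max_trial_s) for x in latencies_s)


# Hypothetical latencies (s) for the four trials of one day.
day1 = daily_latency([100, 200, 300, 700])
```

Averaging the four trials of each day mirrors the per-day averaging described for the rotarod data in Fig. 3i.
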
Three-chamber test. The three-chamber apparatus consists of a clear Plexiglas box 
(24.75 x 16.75 x 8.75) with removable partitions that separate the box into three 
chambers. In both the left and right chambers a cylindrical wire cup was placed 
with the open side down. Age- and gender-matched C57Bl/6 mice were used as 
novel partners. Two days before the test, the novel partner mice were habituated 
to the wire cups (3 inches diameter by 4 inches in height) for 1h per day. After 
habituation in the test room (700 lx, 60 dB white noise), mice were placed in the 
central chamber and allowed to explore the three chambers for 10 min (habituation 
phase). Next, a novel partner mouse was placed into a wire cup in either the left or 
the right chamber. An inanimate object was placed as control in the wire cup of the 
opposite chamber. The location of the novel mouse was randomized between left 
and right chambers across subjects to control for side preference. The mouse tested 
was allowed to explore again for an additional 10 min. The time spent investigating 
the novel partner (defined by rearing, sniffing or pawing at the wire cup) and the 
time spent investigating the inanimate object were measured manually. 
RNA-seq. Mice were euthanized under anaesthesia and the hippocampi were 
quickly dissected over ice. Total RNA of 30 hippocampal samples (three biological 
replicates of each genotype from three experiments) was extracted using miRNeasy 
minikit (Qiagen), following the manufacturer’s instructions. Isolated RNA was 
eluted in RNase-free water and submitted to the Genomic and RNA Profiling 
Core at Baylor College of Medicine. Sample quality checks using the NanoDrop 
spectrophotometer and Agilent Bioanalyzer 2100 were conducted. Then Illumina 
TruSeq RNA library preparation protocol was used as follows. A double-stranded 
DNA library was created using 250 ng of total RNA (measured by picogreen), 
preparing the fragments for hybridization onto a flow-cell. First, cDNA was 
created using the fragmented 3’ poly(A) selected portion of total RNA and random 
primers. During second-strand synthesis, dTTP is replaced with dUTP, which 
quenches the second strand during amplification, thereby achieving strand spec- 
ificity. Libraries were created from the cDNA by first blunt-ending the fragments, 
attaching an adenosine to the 3’-end and finally ligating unique adapters to the 
ends. The ligated products were then amplified using 15 cycles of PCR. The result- 
ing libraries were quantified using the NanoDrop spectrophotometer and fragment 
size assessed with the Agilent Bioanalyzer. A qPCR quantification was performed 
on the libraries to determine the concentration of adaptor ligated fragments using 
Applied Biosystems ViiA7 Real-Time PCR System and a KAPA Library Quant 
Kit. Using the concentration from the ViiA7 qPCR machine, 21 pM of library was 
loaded onto a flow-cell and amplified by bridge amplification using the Illumina 
cBot machine. A paired-end 100 cycle run was used to sequence the flow-cell on 
a HiSeq Sequencing System. 

RNA-seq data pre-processing and analysis. For each sample, about 10 million 
100-base-pair paired-end reads were generated. Raw reads were first groomed by 
removing adapters from both the 3’- and 5’-ends before mapping to the refer- 
ence genome. Then, trimmed reads were aligned to the Mus musculus genome 
(UCSC mm10; the gene model for the mapping was obtained from 
http://ccb.jhu.edu/software/tophat/igenomes.shtml) using TopHat v2.0.9 (ref. 32) with default 
parameters (-r 200 -p 5). The mappability for all 30 samples was above 85%. To pre- 
pare the aligned sequence reads into expression level for differential gene analysis, 
we used the free Python program HTSeq*’. The htseq-count function of HTSeq 
allowed us to accumulate the number of aligned reads that fall under the exons of 
the gene (union of all the exons of the gene). These read counts are analogous to 
the expression level of the gene. Using the obtained read counts, differential gene 
analyses were carried out using the DESeq package and glm.nb function in the 
R environment. DESeq includes functions for us to normalize the read counts of 
multiple samples across several genotypes by the use of the negative binomial 
distribution and a shrinkage estimator for the distribution’s variance*“. glm.nb allows 
us to fit a negative binomial regression model to test the gene changes between 
genotypes. Specifically, for data from the ASO experiment, each gene was tested to 
check whether its expression levels in wild-type and TG-ASO mice differed from 
that in transgenic mice. Similarly, for data from the genetic rescue experiment, 
expression levels in Flox and Flox;TG;Cre-TMX were tested for differences from 
the expression in Flox;TG;Cre-vehicle and Flox;TG. The statistical significance of 
the observed changes was reported by the false discovery rate, which is the P value 
adjusted for multiple testing with the Benjamini-Hochberg procedure. A gene was 
considered significantly different between genotypes if it fell under a false discovery 
rate of 10% and changed in a coherent direction. 
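
The Benjamini-Hochberg adjustment referred to above can be sketched in a few lines. This is a generic textbook implementation with hypothetical P values, not the authors' R code, and it covers only the false discovery rate threshold, not the additional requirement that a gene change in a coherent direction.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(p_values, fdr=0.10):
    """Flag tests whose adjusted P value falls under the chosen FDR threshold."""
    return [q < fdr for q in benjamini_hochberg(p_values)]


# Hypothetical raw P values for four genes.
raw = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(raw)
calls = significant(raw)
```
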

Sample clustering. To assess the similarity of expression patterns between sam- 
ples of different genotypes, we carried out unbiased clustering: expressions of the 
identified significantly changed genes were clustered by sample based on Euclidean 
distance on average linkage and by genes based on Euclidean distance on complete 
linkage. Heat maps were then used to plot the clustered gene expressions for visual 
inspection. The plotted expressions (Z-scores) for each gene were the expressions 


normalized at the gene level to have an average of zero and a standard deviation 
of one. 
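
The gene-level normalization and the distance measure used for clustering can be illustrated as below. This is a minimal sketch using only the Python standard library: the expression matrix is hypothetical, the population standard deviation is an assumption (the text does not say which estimator was used), and the linkage step of hierarchical clustering is deliberately omitted.

```python
from math import sqrt
from statistics import mean, pstdev


def z_scores(values):
    """Scale one gene's expression values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_distances(matrix):
    """Pairwise sample distances; matrix rows are genes, columns are samples."""
    scaled = [z_scores(row) for row in matrix]
    samples = list(zip(*scaled))  # transpose: one vector per sample
    return [[euclidean(a, b) for b in samples] for a in samples]


# Hypothetical expression values: 2 genes (rows) x 3 samples (columns).
distances = sample_distances([[1.0, 2.0, 3.0], [10.0, 10.0, 40.0]])
```

A distance matrix of this kind is the input on which average or complete linkage would then operate.
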

Hippocampal slice preparation. Mice were deeply anaesthetized with isoflurane, 
followed by decapitation. The brain was removed into oxygenated and ice-cold 
cutting solution (CS) containing (in mM): 110 sucrose, 60 NaCl, 3 KCl, 1.25 
NaH2PO4, 28 NaHCO3, 0.5 CaCl2, 7 MgCl2 and 5 glucose, and the caudal portion 
of the forebrain containing the hippocampus and entorhinal cortex was isolated 
by razor blade cuts. Transverse slices (400 μm) were prepared with a Vibratome 
(Vibratome). Cortical tissue was then removed and hippocampal slices were 
equilibrated in a mixture of 50% CS and 50% artificial cerebrospinal fluid (ACSF) 
containing (in mM): 125 NaCl, 1.25 NaH2PO4, 2.5 KCl, 25 NaHCO3, 2 CaCl2, 
1 MgCl2 and 15 glucose, at room temperature for 10-20 min before transfer to 
the recording chamber. 

Slice electrophysiology. All data acquisition and analyses were carried out 
blinded to the genotype and treatment. Electrophysiology was performed in 
an interface chamber (Fine Science Tools). Oxygenated ACSF (95%/5% O2/CO2, 
31°C) was perfused into the recording chamber at the rate of 1.5 ml min⁻¹. 
Electrophysiological traces were digitized and stored using a Digidata 1320A and 
Clampex software (Axon Instruments). fEPSPs were recorded in the stratum 
radiatum with an ACSF-filled glass recording electrode (1-3 MΩ). The relationship 
between fibre volley amplitude and fEPSP slope over various stimulus intensities 
was used to assess baseline synaptic transmission. All subsequent experimental 
stimuli were set to an intensity that evoked 30-40% of the maximal fEPSP slope. 
Slices that did not exhibit stable fEPSP slopes during the first 20 min of recording 
were excluded from the analysis. Paired-pulse facilitation was measured at varying 
interstimulus intervals (20, 50, 100, 200 and 300 ms). LTP was induced by two 
trains of high-frequency stimulation (100 Hz for 1s) with a 20-s intertrain interval. 
Synaptic efficacy was monitored for 20 min before and 70 min after LTP induction 
by recording fEPSPs every 20s (three traces were averaged over succeeding 1-min 
intervals). For the quantification of the last 10 min of the LTP recording (Fig. 2i), 
slices were averaged per mouse, and statistical analysis was done on the animals 
(n=6-7 mice, 15-20 slices, two-tailed t-test). 
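
The per-mouse quantification described above might be sketched as follows. The sketch assumes one fEPSP slope value per minute (20 baseline values followed by 70 post-induction values) and expresses LTP as a percentage of the baseline mean; that normalization, the function names and the example values are assumptions, not details given in the text.

```python
from collections import defaultdict
from statistics import mean


def ltp_magnitude(minute_slopes, baseline_min=20, last_min=10):
    """Mean fEPSP slope over the final minutes, as a percentage of baseline."""
    baseline = mean(minute_slopes[:baseline_min])
    return 100.0 * mean(minute_slopes[-last_min:]) / baseline


def per_mouse_means(slices):
    """Average slices per animal; slices is an iterable of (mouse_id, minute_slopes)."""
    by_mouse = defaultdict(list)
    for mouse_id, slopes in slices:
        by_mouse[mouse_id].append(ltp_magnitude(slopes))
    # One value per mouse, so that statistics are run on animals rather than slices.
    return {mouse: mean(values) for mouse, values in by_mouse.items()}


# Hypothetical recordings: two slices from one mouse, one slice from another.
recordings = [
    ("m1", [1.0] * 20 + [1.5] * 70),
    ("m1", [2.0] * 20 + [4.0] * 70),
    ("m2", [1.0] * 20 + [1.2] * 70),
]
summary = per_mouse_means(recordings)
```

Collapsing slices to one value per mouse keeps the animal, not the slice, as the unit of analysis for the t-test.
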

Intracerebral injection of ASO. Mice were anaesthetized with isoflurane and 
placed on a computer-guided stereotaxic instrument (Angle Two Stereotaxic 
Instrument, Leica Microsystems) that is fully integrated with the Franklin and 
Paxinos*®® mouse brain atlas through a control panel. Anaesthesia (isoflurane 3%) 
was continuously delivered via a small face mask. Ketoprofen 5 mg kg⁻¹ was 
administered subcutaneously at the initiation of surgery. After sterilizing the surgical site 
with betadine and 70% alcohol, a midline incision was made over the skull and a 
small hole was drilled through the skull above the right lateral ventricle. A total of 
500 μg MECP2-ASO or saline was delivered using a Hamilton syringe connected to 
a motorized nanoinjector at 0.3 μl min⁻¹. The coordinates used relative to bregma 
were: anteroposterior (AP) = −0.2 mm, medial lateral (ML) = 1 mm, dorsal 
ventral (DV) = −3 mm, based on a calibration study indicating these coordinates as 
leading to the right ventricle in our mice. To allow diffusion of the solution into 
the brain, the needle was left for 5 min on the site of injection. The incision was 
manually closed with suture. Carprofen-containing food pellets were provided for 
5 days after the surgery. Two weeks after the surgery, the animals were euthanized 
and their brains were dissected for RNA and protein analysis. 

Surgical implantation of cannula and osmotic pumps. Two days before surgery, 
a micro-osmotic pump (Alzet model 1004, Durect) was filled with 500 μg 
MECP2-ASO or control-ASO dissolved in 100 μl saline. The pump was then connected 
through a plastic catheter to a cannula (Alzet Brain Infusion Kit 3, Durect) (see 
Extended Data Fig. 6a). The pump was designed to deliver the drug at a rate of 
0.11 μl h⁻¹ for 28 days. The cannula plus pump assembly was primed in sterile 
saline for 2 days at 37°C. Mice were anaesthetized with isoflurane and placed on 
a computer-guided stereotaxic instrument (Angle Two Stereotaxic Instrument, 
Leica Microsystems). Anaesthesia (isoflurane 3%) was continuously maintained 
via a small face mask. Ketoprofen 5 mg kg⁻¹ was administered subcutaneously at 
the initiation of the surgery. After sterilizing the surgical site with betadine and 70% 
alcohol, a midline incision was made over the skull and a subcutaneous pocket was 
generated on the back of the animal. Next, the pump was inserted into the pocket 
and the cannula was stereotactically implanted to deliver the drug in the right 
ventricle using the following coordinates: AP = −0.2 mm, ML = 1 mm, DV = −3 mm. 
The incision was sutured shut. Carprofen-containing food pellets were provided for 
5 days after the surgery. The pump was disconnected and removed 28 days after the 
initiation of treatment. Two additional weeks were given to the animals to recover 
before any behavioural testing. 
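
The nominal delivery implied by the pump parameters can be worked through as below. The sketch reads the fill as 500 μg ASO in 100 μl saline and the pump rate as 0.11 μl per hour for 28 days; the function name is hypothetical, and the results are nominal figures only, since real delivery depends on the pump lot, the actual fill volume and the residual volume left in the reservoir.

```python
def pump_delivery(total_ug=500.0, fill_volume_ul=100.0,
                  rate_ul_per_h=0.11, days=28):
    """Nominal concentration, delivered volume and dose for a constant-rate pump."""
    concentration = total_ug / fill_volume_ul        # ug of ASO per ul of solution
    volume_delivered = rate_ul_per_h * 24 * days     # ul pumped over the whole period
    return {
        "concentration_ug_per_ul": concentration,
        "volume_delivered_ul": volume_delivered,
        "dose_ug_per_day": concentration * rate_ul_per_h * 24,
        "total_delivered_ug": concentration * volume_delivered,
    }


nominal = pump_delivery()
```

Under these assumptions the solution holds 5 μg of ASO per μl and the pump releases about 13 μg per day, so roughly 74 μl of the 100 μl fill is nominally infused over the 28 days.
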

EEG monitoring. Mice were anaesthetized with isoflurane and mounted in a ster- 
eotaxic frame. Under aseptic conditions, each mouse was surgically implanted with 
three recording electrodes (Teflon-coated silver wire, 125 μm in diameter) aimed at 
the subdural space of left frontal cortex, left parietal cortex and right parietal cortex. 
The reference electrode was then positioned in the occipital region of the skull. 
All electrode wires were attached to a miniature connector (Harwin Connector). 
After 3-5 days of post-surgical recovery, cortical EEG activity (filtered between 
0.5 and 5 kHz, sampled at 2 kHz) and behaviour were simultaneously recorded in 
freely moving mice for 2 h per day over 3-5 days*®. 

EEG data analysis. All the EEG recordings were qualitatively and manu- 
ally analysed by experimenters blinded to the mouse genotype and treatment. 
Electrographic seizure activities were visually identified and matched with the 
behavioural seizure, if applicable. Other abnormal epileptiform spikes were also 
identified visually*”. 

Lymphoblast culture and transfection with ASOs. Following informed consent, 
approved by the Institutional Review Board for Human Subject Research at 
Baylor College of Medicine (H-18122), a venous blood sample was provided by 
five individuals affected with MECP2 duplication syndrome and five age-matched 
controls to establish immortalized B-lymphoblastoid cell lines, following standard 
procedures. Human B-lymphoblastoid cells were cultured in suspension in RPMI 
1640 medium with L-glutamine, penicillin-streptomycin and 10% (v/v) FBS. A day 
before transfection, cells were seeded in 6-well plates at a density of 1 × 10° cells 
in a total volume of 2 ml complete medium. Transfection mixture was prepared 
by combining 20 µl ASO (at the desired concentration), 4 µl transfection reagent 
(TurboFect, R0531, Thermo Scientific) and 180 µl serum-free RPMI medium. The 
mix was incubated at room temperature for 15 min before adding to the cells. RNA 
was extracted from lymphoblasts 48 h after transfection. Lymphoblastoid cells from 
the age-matched control donors and the non-treated MECP2 duplication cells were 
incubated with 4.8 µM control-ASO. 

Statistical analysis. Statistical significance was analysed using GraphPad Prism. 
The number of animals used (n), and the specific statistical tests used are indicated 
for each experiment in the figure legends. Sample size in behavioural studies was 
based on previous reports using transgenic mice with the same background. Mice 
were randomly assigned to vehicle or treatment groups using Excel software to 
generate a table of random numbers, and the experimenter was always blinded 
to the treatment. For behavioural assays, all population values appear normally 

Extended Data Figure 1 | Mice that overexpress a human MECP2 
transgene over a floxed Mecp2 allele resemble the classic MECP2-TG 
mice at the molecular level. a, Breeding strategy to generate mice 
that have a wild-type MECP2 allele and a Mecp2 allele flanked by loxP 
sequences. b, Western blot of MeCP2 in hypothalamus, amygdala and 
cerebellum. GAPDH was used as the internal control (for gel source data, 
see Supplementary Fig. 3). c, Densitometric analysis of western blots in b. 
Flox;TG mice overexpress MeCP2 at levels similar to transgenic mice. It is 
noteworthy that in the hypothalamus, Flox mice were 30% hypomorphic 
(n=2 mice per group; two-tailed t-test) when compared to wild-type 
mice, but not in the other regions. d, RT-qPCR analysis in hypothalamus, 
amygdala and cerebellum, using primers common to mouse and human 
MECP2. Flox;TG mice overexpressed the MECP2 transcript at levels 
similar to transgenic mice. Hprt1 was used as the internal control 
(n=5 mice per group; two-tailed t-test). e, RT-qPCR analysis of three 
selected genes known to be altered by MeCP2 overexpression. Flox;TG 
mice overexpressed the Sst, Crf and Prl2c2 transcripts at levels similar to 
those of transgenic mice. Hprt1 was used as the internal control (n= 4 for 
wild-type group; n= 6 for transgenic group; n = 5 for Flox and Flox;TG 
groups; two-tailed t-test). Data are mean ± s.e.m. *P< 0.05; **P< 0.01; 
***P<0.001. 

Extended Data Figure 2 | Mice that overexpress a human MECP2 
transgene over a floxed Mecp2 allele show characteristic behavioural 
deficits. a, Flox;TG mice displayed hypoactivity and anxiety in the open 
field test similar to transgenic mice. b, Flox;TG mice showed heightened 
anxiety-like behaviour in the elevated plus maze test. c, Flox;TG mice 
showed enhanced motor learning in the rotarod test similar to transgenic 
mice (n= 18 for wild-type group; n= 9 for transgenic group; n= 14 for 
Flox group; n = 12 for Flox;TG group (for all behavioural tests)). Data 
were analysed by one-way ANOVA, with the exception of the rotarod 
test that was analysed by two-way ANOVA repeated measures followed 
by Tukey HSD post hoc correction for multiple comparisons. Data are 
mean ± s.e.m. *P < 0.05; **P < 0.01; ***P< 0.001. 

Extended Data Figure 3 | Correction of gene expression after genetic 
rescue. a, b, RT-qPCR from the hypothalamus (a) and cerebellum 
(b) shows correction of altered expression of selected genes after 
normalization of MeCP2 levels in Flox;TG;Cre-TMX mice (n= 6 for Flox 
group; n=7 for Flox;TG group; n =7 for Flox;TG;Cre-vehicle group; n =8 
for Flox;TG;Cre-TMX group; two-tailed t-test). Data are mean ± s.e.m. 
*P< 0.05; **P<0.01; ***P<0.001. 

Extended Data Figure 4 | Hippocampal basal synaptic transmission and 
short-term synaptic plasticity are normal in MECP2-TG mice. a, Long-term 
potentiation was induced by applying high-frequency stimulation to 
the Schaffer collateral axons, and field excitatory postsynaptic potentials 
(fEPSPs) were recorded at the Schaffer collateral-CA1 synapses of the 
hippocampus (stratum radiatum). b, MeCP2 overexpression did not affect 
Schaffer collateral basal synaptic transmission, as determined by the 
correlation between the slopes of the evoked fEPSPs and the amplitudes 
of the fibre volleys. c, d, MeCP2 overexpression did not affect short-term 
synaptic plasticity, as determined by paired-pulse facilitation (n =7 for 
Flox group; n = 6 for Flox;TG group; n=7 for Flox;TG;Cre-vehicle group; 
n=7 for Flox;TG;Cre-TMX group). Data are mean ± s.e.m. 

Extended Data Figure 5 | ASO-5 reduces MECP2 mRNA and protein 
levels. a, Location of ASOs targeting sequences on the MECP2 
pre-mRNA. Boxes represent exons, and lines denote introns. Boxes in blue 
denote the translatable regions. b, Section schemata from the Paxinos 
and Franklin mouse brain atlas showing site of stereotactic injection. 
c, d, Western blot (c) and densitometric analysis (d) of MeCP2 from cortical 
samples of mice treated with saline or the indicated ASOs 2 weeks after 
single bolus stereotactic injection of 500 µg ASO in the right ventricle 
of the brain. ASO-5 was found to be the most effective. GAPDH was 
used as an internal control (n = 3, one-way ANOVA followed by Tukey 
HSD post hoc correction for multiple comparisons; for gel source data, 
see Supplementary Fig. 4). e, RT-qPCR analysis of MECP2 mRNA from 
cortical samples of mice treated with saline or the indicated ASOs, 2 weeks 
after single bolus stereotaxic injection of 500 µg ASO in the right ventricle 
of the brain. ASO-5 was found to be the most effective. The MECP2 
primers are common to the mouse and human alleles. Hprt1 was used 
as an internal control (n= 3, one-way ANOVA followed by Tukey HSD 
post hoc correction for multiple comparisons). Data are mean ± s.e.m. 
*P< 0.05; **P<0.01; ***P< 0.001. 

Extended Data Figure 6 | Osmotic micro-pumps for gradual 
intracerebroventricular infusion of ASO. a, Micro-osmotic pump 
connected to a cannula through a plastic catheter. b, The micro-osmotic 
pump was implanted in a subcutaneous pocket and the cannula 
stereotaxically positioned to deliver the ASO into the right ventricle. 
c, Section schemata from the Paxinos and Franklin mouse brain atlas 
[…] the catheter to show that the dye reaches the whole ventricular system, 
confirming the correct positioning of the tip of the cannula in the right 
ventricle. e, RT-qPCR for Aif1 (a marker of activated microglia) and Gfap 
(a marker of astrocytes) immediately after the end of 4 weeks of gradual 
ASO treatment (n= 4 for wild-type group; n=5 for transgenic group; 
n=4 for TG-ASO group). Data are mean ± s.e.m. 

Extended Data Figure 7 | Behavioural phenotype 6-7 weeks after 
the initiation of ASO treatment. a, Timeline of ASO treatment and 
behavioural tests. b, Two weeks after cessation of ASO treatment, 
MECP2-TG mice showed no rescue of hypoactivity, rearing behaviour or 
anxiety parameters in the open field test. c, No rescue was evident in any 
of the parameters of the elevated plus maze test at this early post-treatment 
stage. d, ASO treatment normalized performance in the rotarod test in 
MECP2-TG mice (asterisks indicate significance between MECP2-TG 
ASO and control-ASO groups). e, The impaired social behaviour in 
the 3-chamber test was not normalized in the ASO-treated group. 
No significant difference was found between any of the groups in the time 
spent investigating the inanimate object. f, No preference for the cups 
placed in the left or right chambers was found in the habituation phase 
of the 3-chamber test (n= 19 for wild-type group; n= 16 for transgenic 
group; n= 15 for TG-ASO group (for all behavioural tests)). Data were 
analysed by one-way ANOVA followed by Fisher’s LSD post hoc test, 
with the exception of the rotarod test that was analysed by two-way 
ANOVA repeated measures followed by Tukey HSD post hoc correction 
for multiple comparisons. Data are mean ± s.e.m. *P < 0.05; **P< 0.01; 
***P< 0.001. 

Extended Data Figure 8 | Cortical EEG recording. Representative EEG traces of all mice recorded after gradual MECP2-ASO or control-ASO treatment. 

Extended Data Table 1 | ASO sequences 

ASO          Sequence              Length (bp)  Molecular weight (Da)  Chemistry 
ASO-1        AACTCTCTCGGTCACGGGCG  20           7141.88                MOE gapmer 
ASO-2        CACACTGACCTTTCAGGGCT  20           7100.87                MOE gapmer 
ASO-3        GATCACTGGAACACAATGGT  20           7155.87                MOE gapmer 
ASO-4        CGTGCCATGGAAGTCCTTCC  20           7116.87                MOE gapmer 
ASO-5        GG CTCCTTTATTATC      20           7035.77                MOE gapmer 
Control-ASO  G CAAATACACCTTCAT     20           7045.82                MOE gapmer 

The sequence, length, molecular mass and type of chemical modification of the different MECP2 ASOs tested and the control ASO. 

Extended Data Table 2 | Demographics of MECP2 duplication syndrome patients 

Subject ID     Date of birth  Date of collection  Age at collection (years)  Gender  MECP2 expression fold change (relative to average control) 
Control-1      07.10.1987     10.15.1992          5                          Male    1.26 
Control-2      08.22.1998     07.08.1999          1                          Male    0.87 
Control-3      07.13.2006     02.15.2013          6.7                        Male    0.62 
Control-4      11.17.2006     11.02.2012          6                          Male    0.80 
Control-5      06.23.2006     06.14.2013          7                          Male    1.43 
Duplication-1  04.11.2007     11.13.2008          1.5                        Male    2.51 
Duplication-2  11.07.2002     12.06.2007          5                          Male    2.28 
Duplication-3  10.25.2003     04.14.2008          4.5                        Male    2.42 
Duplication-4  02.02.1999     07.26.2006          7.5                        Male    1.99 
Duplication-5  11.01.2002     06.17.2008          5.5                        Male    2.00 


The demographics of the MECP2 duplication patients and healthy donors that provided blood samples to establish immortalized B-lymphoblastoid cell lines. The last column on the right describes the 
individual MECP2 mRNA expression levels in lymphoblasts, relative to the average of the healthy donors. 
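
The age column of this table can be cross-checked against the two date columns. The sketch below is not from the paper. It assumes the dates are written as MM.DD.YYYY (the entry 10.15.1992 can only be read month first) and that a year is 365.25 days; the helper names are illustrative.

```python
from datetime import date


def parse_us_date(text):
    """Parse an MM.DD.YYYY string (assumed format of the table's date columns)."""
    month, day, year = (int(part) for part in text.split("."))
    return date(year, month, day)


def age_in_years(born, collected):
    """Age at sample collection in years, using 365.25 days per year."""
    return (parse_us_date(collected) - parse_us_date(born)).days / 365.25


# Two rows whose dates are taken from the table above.
samples = {
    "Duplication-1": ("04.11.2007", "11.13.2008"),
    "Duplication-3": ("10.25.2003", "04.14.2008"),
}

for subject, (born, collected) in samples.items():
    print(f"{subject}: {age_in_years(born, collected):.1f} years")
```

The dates give about 1.6 and 4.5 years for these two subjects, which suggests that the age entries for these rows are fractional years rather than whole numbers.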

Therapeutic antibodies reveal Notch control of 
transdifferentiation in the adult lung 


Daniel Lafkas, Amy Shelton, Cecilia Chiu, Gladys de Leon Boenig, Yongmei Chen, Scott S. Stawicki, Christian Siltanen, 
Mike Reichelt, Meijuan Zhou, Xiumin Wu, Jeffrey Eastham-Anderson, Heather Moore, Meron Roose-Girma, 
Yvonne Chinn, Julie Q. Hang, Søren Warming, Jackson Egen, Wyne P. Lee, Cary Austin, Yan Wu, Jian Payandeh, 
John B. Lowe & Christian W. Siebel 


Prevailing dogma holds that cell-cell communication through 
Notch ligands and receptors determines binary cell fate decisions 
during progenitor cell divisions, with differentiated lineages 
remaining fixed!. Mucociliary clearance”* in mammalian 
respiratory airways depends on secretory cells (club and goblet) and 
ciliated cells to produce and transport mucus. During development 
or repair, the closely related Jagged ligands (JAG1 and JAG2) induce 
Notch signalling to determine the fate of these lineages as they 
descend from a common proliferating progenitor* ®. In contrast 
to such situations in which cell fate decisions are made in rapidly 
dividing populations” ”, cells of the homeostatic adult airway 
epithelium are long-lived!!~"’, and little is known about the role of 
active Notch signalling under such conditions. To disrupt Jagged 
signalling acutely in adult mammals, here we generate antibody 
antagonists that selectively target each Jagged paralogue, and 
determine a crystal structure that explains selectivity. We show 
that acute Jagged blockade induces a rapid and near-complete 
loss of club cells, with a concomitant gain in ciliated cells, under 
homeostatic conditions without increased cell death or division. Fate 
analyses demonstrate a direct conversion of club cells to ciliated cells 
without proliferation, meeting a conservative definition of direct 
transdifferentiation. Jagged inhibition also reversed goblet cell 
metaplasia in a preclinical asthma model, providing a therapeutic 
foundation’. Our discovery that Jagged antagonism relieves a 
blockade of cell-to-cell conversion unveils unexpected plasticity, 
and establishes a model for Notch regulation of transdifferentiation. 

Using phage display to generate synthetic therapeutic antibodies 
targeting JAG1 or JAG2, we identified a selective and potent inhibi- 
tor of each ligand that also cross-reacted with the human and mouse 
orthologues: anti-JAG1 blocking antibody version 70 (anti-JAG1.b70) 
and anti-JAG2 blocking antibody version 33 (anti-JAG2.b33). 
Anti-JAG1.b70 bound to purified JAG1 with high affinity but not 
to JAG2 or the other canonical Notch ligands Delta-like 1 (DLL1) or 
DLL4; conversely, anti-JAG2.b33 showed high-affinity binding only 
to JAG2, although very weak binding (3,500-fold lower affinity) was 
detected to JAG1 (Fig. 1a and Extended Data Figs 1 and 2a). Each 
antibody potently inhibited signalling induced only by the cognate 
ligand (Fig. 1b, c and Extended Data Fig. 2c). Both antibodies inhibited 
signalling through NOTCH1, NOTCH2 and NOTCH3 (data not 
shown), demonstrating that inhibition occurs irrespective of the par- 
ticular receptor. 

To understand the molecular basis of anti-JAG1 antagonism and 
selectivity, we determined a high-resolution crystal structure of the 
antibody Fab fragment bound to human JAG1. The epitope lies at 


the junction of the Delta-Serrate-Lag (DSL) and epidermal growth 
factor-like-1 (EGF1) domains (Fig. 1d, e and Extended Data Figs 2b 
and 3). Numerous epitope residues differ between the JAG orthologues. 
Notably, heavy-chain residue His35 forms an ionic bridge with JAG1 
residue Asp204 (Fig. 1d, e and Extended Data Fig. 3b, c), an apparent 
anchor that cannot be formed with the equivalent JAG2 residue Asn204. 
Sequence alignments and structural modelling provide clear explana- 
tions for anti-JAG1.b70 cross-reactivity to human and mouse JAG1 
(Extended Data Fig. 3a, c). The NOTCH1-DLL4 crystal structure’® 
(Protein Data Bank (PDB) accession 4XL1) and models of the 
NOTCH2-JAG1 interaction!’ indicate that the DSL-EGF1 face 
contains residues key to NOTCH receptor binding, pointing to steric 
hindrance of receptor binding as the mechanism for antibody blocking 
(Extended Data Fig. 3d, e). 

To determine the effects of JAG inhibition in vivo, we injected mice 
with each blocking antibody over eight days (Fig. 2a). Anti-JAG1.b70 
but not anti-JAG2.b33 induced a significant decrease in club cells and 
a corresponding increase in ciliated cells, as assessed by immuno- 
fluorescent detection of club and ciliated cell markers (Fig. 2b, c and 
Extended Data Fig. 4a). The lack of an altered lung phenotype after 
JAG2 inhibition was not due to a lack of antibody activity, because 
anti-JAG2.b33, but not anti-JAG1.b70, caused a near-complete loss of 
sebaceous gland cells, providing pharmacodynamic evidence of JAG2 
antagonism (Extended Data Fig. 4b). This dominance of JAG1 versus 
JAG2 fits with cell fate analyses during lung development’. Increasing 
the dosage (to 30 mg kg−1) or treatment duration (to several weeks) did 
not alter the phenotypes, bolstering the conclusion that each antibody 
is selective in vivo for its cognate ligand, and indicating that systemic 
JAG1 or JAG2 blockade is well tolerated (data not shown). 

Notably, dual blockade of JAG1 plus JAG2 generated bronchiolar 
epithelium that was nearly devoid of club cells and instead almost 
completely comprised of ciliated cells, as assessed by immunoflu- 
orescence, quantification and electron microscopy (Fig. 2b-d and 
Extended Data Fig. 4a). Intranasal antibody delivery also induced 
this phenotype, consistent with a direct effect on the epithelium 
(Extended Data Fig. 4c). Although the loss of club cells appeared 
nearly complete, rare cells remained positive for the club cell marker 
CC10 (also known as SCGB1A1) at two locations: adjacent to pul- 
monary neuroendocrine cells (Extended Data Fig. 4d) and at bron- 
choalveolar duct junctions (Extended Data Fig. 8a). We speculate 
that these may be variant club cells proposed to participate in epi- 
thelial regeneration'*!°. Transcriptomic analysis confirmed that our 
JAG antagonists inhibited Notch signalling, and revealed changes 
in lineage marker expression consistent with histological analyses 

Figure 1 | Anti-JAG1.b70 and anti-JAG2.b33 specifically block signalling 
induced by their cognate ligands. a, ELISA measuring antibody binding to 
purified protein fragments of human (h) and murine (m) canonical Notch 
ligands. b, c, Reporter assays of Notch signalling induced by JAG1 (b) 

or JAG2 (c); antibody concentrations are shown in µg ml−1 (25 µg ml−1 
corresponds to 170 nM); values reflect reporter gene signalling relative to 
control reporter (mean ± s.d., n= 12 from three independent experiments 
with four replicates in each experiment); anti-ragweed (RW), isotype control 


(Extended Data Fig. 4e). Antagonist antibodies targeting NOTCH1 
and NOTCH2 (ref. 20) revealed NOTCH2 as the dominant recep- 
tor, with a secondary role for NOTCH1 (Extended Data Fig. 5a), 
consistent with studies during development”. Adding to our analyses 
of ligand and receptor expression (Extended Data Fig. 5b-e), these 
results demonstrate that active Notch signalling is required for club 
cell maintenance, with JAG1 on ciliated cells and NOTCH2 on club 
cells as the dominant pairing. 

Notably, we detected obvious club cell loss as early as four days 
after antibody dosing, with the phenotypic switch complete by day 6 
(Fig. 2e), significantly accelerated relative to homeostatic turnover 
of 6-12 months''. We considered that JAG blockade might induce 
cell death, followed by progenitor proliferation and differentiation. 
However, we detected no increase in apoptosis or proliferation in 
bronchiolar epithelial cells, as assessed by staining for cleaved caspase 
(data not shown) or KI67 (Fig. 2f and Extended Data Fig. 6a). We 
also included BrdU throughout antibody dosing to label any cell that 
proliferated during epithelial conversion (Fig. 2g). While we observed 
a slight increase in the percentage of labelled cells after JAG blockade, 
we did not detect proliferation in club cells (see below), and this per- 
centage (2.2%) is insufficient to consider proliferation as a driving 
force behind the near-complete cell fate switch (Fig. 2h and Extended 
Data Fig. 6b). 

This rapid-onset phenotype in the absence of notable proliferation 
raised the provocative possibility that club cells were directly convert- 
ing into ciliated cells. Consistent with this hypothesis, JAG blockade 
induced the appearance of ‘intermediate’ cells that expressed markers 
of both lineages (Extended Data Fig. 6c-f). To trace changes in club cell 
fate, we generated a Scgb1a1-CreERT2°%" mouse line (Extended Data 
Fig. 7a) and crossed it to the Rosa26-lsl-tdTomato reporter strain”’. 
Induced club cell labelling (Fig. 3a) was detected in 2.4% of airway 
epithelial cells, and this frequency was not altered by JAG blockade 
(Extended Data Fig. 7b), supporting the conclusion that JAG inhibition 
does not prompt club cell death or proliferation. In control epithelium, 
the vast majority of tdTomato-positive cells expressed CC10 
(97.80% ± 2.20% (mean ± s.d.)), whereas only a small percentage 
expressed the ciliated cell marker FOXJ1 (5.00% ± 1.70%) (Fig. 3b, c), 
consistent with rare intermediate cells in control lung. In obvious contrast, 
six days of JAG or NOTCH inhibition (Fig. 3b, c and Extended 
Data Fig. 7c, d) reversed the fate distribution, such that the vast majority 
of tdTomato-positive cells now expressed ciliated cell markers 
acetylated-α-tubulin or FOXJ1 (85.5% ± 2.8%, Fig. 3b, c), whereas 
only 11.7% ± 2.9% of them expressed CC10 (Fig. 3c). The newly generated 
ciliated cells were functional, as determined by examining cilia 
motility from live tdTomato-positive cells isolated after JAG blockade 
(Supplementary Videos 1 and 2). JAG blockade did not increase BrdU 
incorporation in traced cells above the low level (<0.02%) found in 
controls (Fig. 3d), demonstrating that JAG inhibition does not increase 
division of the relevant cell population. Our lineage-tracing experiments 
thus establish that JAG inhibition induces a direct transdifferentiation 
of one cell type to another, without cell division. 

Thirteen weeks after a single antibody dose, the distal airways had 
largely recovered, although the more proximal airways remained cil- 
iated (Extended Data Fig. 8a). This slow rate cannot be explained by 
antibody perdurance, because measurements of antibody half-life 
(approximately one week) informed us that antibody serum levels fell 
below the efficacious threshold after three weeks (data not shown). 
After a 13-week chase under our CC10 lineage-tracing conditions 
(Fig. 3a), and as expected for a slowly regenerating organ, a significant 
fraction of traced cells remained as single cells, both after control 
(79.50% ± 3.25%) and anti-JAG (84.40% ± 5.28%) treatments 
(Fig. 3e). In controls, the vast majority of these single cells were club 
cells (92.19% ± 4.48%), whereas 91.54% ± 7.10% of such cells remained 
ciliated after JAG blockade (Fig. 3f). In the control group, expanded 

Figure 2 | JAG blockade induces rapid near-complete club cell loss 

and ciliated cell gain without proliferation. a, Schematic of antibody 
treatment schedule. b, Immunofluorescent staining for the club cell 
marker CC10 (green) and the ciliated cell marker acetylated-α-tubulin 
(acet-α-Tub; red) in distal airway epithelium. Nuclei counterstained with 
DAPI. c, Quantification of club and ciliated cell numbers as assessed 

by immunofluorescent staining for CC10 and FOXJ1, respectively 
(percentages of each cell type over the total number of airway cells 
counted, n = 5; unpaired t-test, **P < 0.01, ***P < 0.001). d, Scanning 
electron microscopy, revealing an epithelium rich in club cells (arrows) 
after control treatment (left), in contrast to an epithelium made exclusively 


clones were small (two to four cells), but after JAG blockade the clones 
were larger (average of seven cells) (Fig. 3e). These clones localized at 
bronchoalveolar duct junctions, where rare club cells remained after 
JAG blockade, and contained both club and ciliated cells (Extended 
Data Fig. 8b-d). We propose that the transdifferentiated ciliated cells 
do not ‘reconvert’ to club cells but instead are replaced during normal 
epithelial turnover with directionality, from the bronchoalveolar duct 
junctions to the larger airways. 
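
The perdurance argument earlier in this paragraph can be illustrated with a simple decay calculation. The sketch below is not from the paper. It assumes first-order (single-exponential) elimination with the reported half-life of approximately one week; the function name is illustrative, and the efficacious threshold itself is not stated in the text, so only remaining fractions are computed.

```python
def fraction_remaining(elapsed_weeks, half_life_weeks=1.0):
    """Fraction of the initial serum antibody level left after elapsed_weeks,
    assuming first-order elimination with a constant half-life."""
    return 0.5 ** (elapsed_weeks / half_life_weeks)


for weeks in (1, 3, 13):
    print(f"week {weeks}: {fraction_remaining(weeks):.6f} of the starting level")
```

Under this assumption about one eighth of the starting level is left after three weeks, and roughly one part in eight thousand after thirteen weeks, which is consistent with the statement that residual antibody is unlikely to account for the slow recovery.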

Excess mucus secretion in airways is a unifying complication of 
several diseases, including asthma, idiopathic pulmonary disease 
and chronic obstructive pulmonary disease. Thus, we tested whether 
NOTCH blockade could convert mucus-secreting to mucus-clearing 
cells in an oft-studied pre-clinical model of goblet cell metaplasia”’. 
Sensitized mice were challenged with inhaled ovalbumin to induce 
inflammation, which stimulates club cells to differentiate into goblet 

of ciliated cells (arrowheads) after JAG blockade (right). e, Kinetics of cell 
fate changes. Mice were treated with a single antibody dose and analysed 
on the indicated days by immunofluorescence for club cells or ciliated cells 
as in b (n=3 mice per time point). f, Quantification of the percentage of 
airway cells stained for the proliferation marker KI67 (white) as assessed by 
immunofluorescence (n =5 mice per group; unpaired t-test, **P < 0.01). 

g, Mice were dosed twice with antibodies over six days with 1 mg ml−1 

BrdU in the drinking water to assess cell division. h, Quantification of the 
percentage of airway cells positive for BrdU immunofluorescence staining 
(n=5 mice per group; unpaired t-test, *P < 0.05). Data are mean ± s.d. Scale 
bars, 10 µm (b), 5 µm (d) and 20 µm (e). 


cells”*4, the source of aberrant mucus secretion. In a prevention study, 
with antibody delivered 24h after the first ovalbumin inhalation, we 
found that blocking a JAG1-NOTCH2 signalling axis reduced gob- 
let cell metaplasia to that of non-sensitized animals (Extended Data 
Fig. 9, controls in Fig. 4b, c). Notably, we found that anti-JAG1.b70— 
either alone or in combination with anti-JAG2.b33—also effectively 
reversed goblet cell metaplasia that had been fully established before 
antibody dosing (Fig. 4). Neither ligand nor receptor inhibition 
affected the inflammation severity score, our standard method for 
assessing the immune response (Fig. 4h and Extended Data Fig. 9d, g). 
Likewise, JAG blockade did not significantly reduce the number 
of lung eosinophils, the main drivers of lung inflammation in this 
model?>° (Extended Data Fig. 10a, b), nor did it alter cytokines in 
a manner that could explain a possible anti-inflammatory effect of 
antibody treatment (Extended Data Fig. 10c, d). These results point to 

Figure 3 | Lineage tracing demonstrates that club cells transdifferentiate 
into ciliated cells, which do not revert back after antibody washout. 
a, Schematic of tamoxifen induction and antibody treatment of 
Scgb1a1-CreERT2°%¥/Rosa26-lsl-tdTomato mice. b, Immunofluorescence staining 
of lineage-traced cells for CC10 (green), acetylated-α-tubulin (red) and 
the lineage-tracing marker tdTomato (pink). Boxed areas are enlarged in 
the bottom row, with tdTomato-positive cells outlined but not coloured 
(n=5 mice per group). Nuclei counterstained with DAPI. Scale bars, 10 µm. 
c, Quantification of the percentage of lineage-traced cells (tdTomato+) 
expressing club (CC10) or ciliated (FOXJ1) cell markers (n =5 mice per 
group, mean ± s.d.; unpaired t-test, ***P < 0.001). d, Quantification of the 
percentage of proliferating lineage-traced cells (tdTomato+/BrdU+) in the 
lung epithelium (n=5 mice per group, mean ± s.d.; unpaired t-test, NS, 

not significant). e, Analysis of clonal expansion of lineage traced club cells 
during the recovery period, 13 weeks after a single antibody dose. f, Cell fate 
analysis of single traced club cells from e, based on immunofluorescence 
staining for CC10 (club cells) and FOXJ1 (ciliated cells). 


an epithelial-cell-specific mechanism, and suggest that prevention and 
reversal of goblet cell metaplasia reflect direct effects of JAG inhibition 
on lung cell fate. 

Figure 4 | JAG blockade reverses goblet cell metaplasia in vivo. 

a, Mice were sensitized during a 34-day period after intraperitoneal (i.p.) 
injection of ovalbumin (OVA) or vehicle (non-sensitized, b) and then 
challenged with inhaled ovalbumin to induce inflammation and goblet cell 
metaplasia. D, day. c-f, Alcian blue/periodic acid-Schiff (PAS) staining for 
mucin from lung sections of sensitized mice treated with control antibody 
at the first (day 42) (c) or second (day 48) (d) dose of control antibody, or at 
the second dose of anti-JAG1.b70 (e) or anti-JAG1.b70 plus anti-JAG2.b33 
(f). Scale bars, 20 µm. g, Quantification of goblet cell area (n=7 mice per 
group, mean ± s.d., unpaired t-test, ***P < 0.001). h, Inflammation index as 
assessed by haematoxylin and eosin staining. 


By acutely blocking the Notch pathway, we have uncovered an inher- 
ent requirement for Notch activity in club cells to maintain their cell 
fate in the adult homeostatic lung. Pharmacological inhibition of Jagged 
ligands delineates a previously uncharacterized relationship between 
club and ciliated cells, with Notch signalling induced in the former by 
Jagged expression on the latter. Disrupting this interaction unveiled a 
remarkable plasticity of adult club cells, which transdifferentiated into 
ciliated cells within a few days without cell division. At the same time, 
previous studies of the club-ciliated cell relationship under prolifera- 
tive conditions perhaps portended our findings*”~”’. Club cells have 
been referred to as a ‘not undifferentiated’ cell type, acknowledging 
the paradoxical combination of specialized function that is a hallmark 
of a terminally differentiated cell together with an ability to generate 
other lineages through division and differentiation”. Recent genetic 
studies suggest that other cell types may also regulate this cell fate in 
the trachea, which includes a basal layer proposed to be the key source 
of Jagged required for Notch activation in the differentiated club cells 
and proliferating progenitors*”. 

Our antibodies enable pharmacological conversion of a diseased air- 
way from an epithelium aberrantly producing mucus to one primarily 
but temporarily functioning to clear mucus, with hope for a therapeutic 
window in the absence of toxicities associated with pan-NOTCH 
inhibition. This ability to quickly modulate cell fate thus holds promise 
for a new type of therapy. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 

Phage library sorting and screening. Recombinant human JAG1 (extracellular 
domain (ECD)) and human and mouse JAG2 (DSL-EGF1-4) were used as 
antigens for library sorting. Nunc 96-well Maxisorp immunoplates were coated 
overnight at 4°C with target antigen (10 µg ml−1) and blocked for 1 h at room 
temperature with PBST buffer (PBS with 1% (w/v) bovine serum albumin (BSA) and 
0.05% (v/v) Tween-20). Either VH antibody phage libraries*! or VH/VL libraries*” 
were added to antigen plates and incubated overnight at room temperature. Plates 
were washed ten times with PBT (PBS with 0.05% Tween-20), and bound phage 
were eluted with 50 mM HCl plus 500 mM NaCl for 30 min, and neutralized with 
an equal volume of 1 M Tris base, pH 7.5. Recovered phages were amplified in 
Escherichia coli XL-1 Blue cells. During subsequent selection rounds, phage-antigen 
incubation times were reduced to 2-3h, while the stringency of plate washing was 
gradually increased. After four rounds, 96 clones were picked from each of the VH 
and VH/VL library sorts and analysed for JAG1 and JAG2 binding. The phage 
supernatant was diluted 1:5 in ELISA buffer (PBS with 0.5% BSA, 0.05% Tween-20) 
in 100 µl total volume and transferred to plates coated with target protein (1 µg ml⁻¹
directly coated overnight). After 1h of gentle shaking to allow phage binding, the 
plate (room temperature) was washed ten times with PBST, incubated for 30 min 
with horseradish peroxidase (HRP)-conjugated anti-M13 antibody in ELISA buffer 
(1:5,000), washed ten times with PBST, and incubated for 5 min with 50 µl each
of 3,3’,5,5’-tetramethylbenzidine (TMB) peroxidase substrate and peroxidase
solution B (H2O2) (Kirkegaard & Perry Laboratories). Reactions were stopped with
100 µl 0.1 M phosphoric acid (H3PO4) and absorbance was determined at 450 nm.
The reduction in absorbance (%) was calculated by the following equation: 
A450 nm reduction (%) = [(A450 nm of wells with competitor)/(A450 nm of wells 
without competitor)] × 100. Clones that bound specifically, defined as yielding 
an A450 nm at least fivefold higher for JAG1 or JAG2 binding over background, 
were selected and JAG1-JAG2 cross-binders were eliminated. Unique VL and VH 
sequences were cloned into the LPG3 and LPG4 vectors, respectively, to generate 
full-length human IgG1 constructs, which were expressed in mammalian CHO 
cells and purified with a protein A column. 

Affinity improvement. For each clone that showed promising cell-based assay 
activity, phagemid containing four stop codons (TAA) in complementarity deter- 
mining region loop 3 (CDR L3) and displaying monovalent Fab on the surface of 
M13 bacteriophage was generated. These phagemids served as the templates for 
Kunkel mutagenesis for the construction of affinity maturation libraries. For affinity
maturation, a soft randomization strategy was used, in which mutagenic DNA was
synthesized with 70-10-10-10 mixtures of bases favouring the wild-type nucleotides
to obtain a mutation rate of approximately 50% at the selected positions.
Four different combinations of CDR loops, H1/L3, H2/L3, H3/L3 and L1/L2/L3 
were selected for randomization. 
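
As a rough illustration of why a 70-10-10-10 doping scheme lands near a 50% mutation rate, the sketch below computes the chance that a doped codon no longer encodes the wild-type residue. It assumes each base is drawn independently (70% wild type, 10% for each other base) and uses the standard genetic code; the function names and the two example codons are illustrative choices, not taken from the paper, and a change to a stop codon is counted as a mutation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_probability(wild_type, variant, p_wild_type=0.7):
    """Probability of synthesizing `variant` when doping around `wild_type`."""
    p_other = (1.0 - p_wild_type) / 3.0  # 10% for each non-wild-type base
    prob = 1.0
    for wt_base, base in zip(wild_type, variant):
        prob *= p_wild_type if base == wt_base else p_other
    return prob


def residue_change_rate(wild_type, p_wild_type=0.7):
    """Probability that the doped codon no longer encodes the wild-type residue."""
    wt_residue = CODON_TABLE[wild_type]
    return sum(
        codon_probability(wild_type, codon, p_wild_type)
        for codon, residue in CODON_TABLE.items()
        if residue != wt_residue
    )


if __name__ == "__main__":
    # Ala (fourfold-degenerate codon) and Trp (single codon) as two extremes.
    for codon in ("GCT", "TGG"):
        print(codon, CODON_TABLE[codon], round(residue_change_rate(codon), 3))
```

Under these assumptions a fourfold-degenerate codon such as GCT changes residue with probability 0.51, and a single-codon residue such as Trp (TGG) with probability 0.657, which brackets the approximately 50% rate quoted above.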

For affinity improvement selection, human JAG1 or JAG2 was first biotinylated 
under limiting reagent condition. Phage libraries were subjected to six rounds of 
solution sorting with increasing stringency. For the first round of solution sorting, 
three absorbance units per ml in 1% BSA and 0.05% Tween-20 of phage input were 
incubated on plates pre-coated with antigens for 3 h. The wells were washed with
PBT ten times. Bound phage was eluted with 150 µl per well of 50 mM HCl and
500 mM KCl for 30 min, and subsequently neutralized by 50 µl per well of 1 M Tris,
pH 8.0, titred, and propagated for the next round. For subsequent rounds, pan- 
ning of the phage libraries was done in solution phase, in which phage library was 
incubated with an initial concentration of 200 nM for biotinylated JAG1 and 20 nM
for biotinylated JAG2 protein (the concentration is based on the parental clone equilibrium
dissociation constant) in 100 µl buffer containing 1% Superblock (Pierce
Biotechnology) and 0.05% Tween-20 for 2h at room temperature. The mixture was 
further diluted ten times with 1% Superblock, and 100 µl per well was applied to
neutravidin-coated wells (10 µg ml⁻¹) for 30 min at room temperature with gentle
shaking. To determine the background binding, control wells containing phage 
were captured on neutravidin-coated plates. Bound phage was then washed, eluted 
and propagated as described for first round. Five more rounds of solution sorting 
were carried out with increasing selection stringency. The first couple of rounds
were for on-rate selection, decreasing the biotinylated target protein
concentration from 200 nM or 20 nM to 0.5 nM, and the last two rounds
were for off-rate selection, adding excess amounts of non-biotinylated target protein
(300-1,000-fold more) to compete off weaker binders at room temperature.
Affinity screening ELISA (single spot competition). Colonies were picked from 
the sixth round of screening and were grown overnight at 37 °C in 1.50 ml per well 
of 2YT media with 50 µg ml⁻¹ carbenicillin and 1 × 10¹⁰ per ml M13KO7 in a 96-well
plate (Falcon). From the same plate, a colony of XL-1-infected parental phage was
picked as control. 96-well Nunc Maxisorp plates were coated with 100 µl per well
of either JAG1 or JAG2 (0.5 µg ml⁻¹) in PBS at 4 °C overnight. The plates were
blocked with 150 µl of 1% BSA in PBST for 1 h.


The phage supernatant (35 µl) was diluted with 75 µl ELISA buffer with or
without 25 nM JAG1 or 5 nM JAG2, and incubated for 1 h at room temperature
in an F plate (NUNC). The mixture (95 µl) was transferred side by side to the
antigen-coated plates. The plate was gently shaken for 15 min and washed ten 
times with PBT. The binding was quantified by adding HRP-conjugated anti-M13 
antibody in ELISA buffer (1:2,500) and incubated for 30 min at room temperature. 
The plates were washed with PBST ten times. Next, 100 µl per well of peroxidase
substrate was added to the well and incubated for 5 min at room temperature. The
reaction was stopped by adding 100 µl 0.1 M phosphoric acid (H3PO4) to each well
and allowed to incubate for 5 min at room temperature. The absorbance of the 
yellow colour in each well was determined using a standard ELISA plate reader at 
450 nm. In comparison to the A450 nm reduction (%) of the well of parental phage 
(100%), clones that had the A450 nm reduction (%) lower than 50% were picked for 
sequence analysis. Unique clones were selected for phage preparation to determine 
binding affinity (phage IC50) against target antigen by comparison to the parental
clone. Clones that showed the greatest affinity improvement were reformatted into human
IgG1 for antibody production and further BIAcore binding kinetic analysis and 
other in vitro or in vivo assays. 
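
The single-spot competition readout can be sketched as follows: the A450 nm reduction (%) is the ratio of the signal with competitor to the signal without competitor, times 100, and clones below the 50% cut-off are kept. The function names and the example absorbance values are hypothetical; only the ratio and the 50% threshold come from the text.

```python
def a450_reduction_percent(a450_with_competitor, a450_without_competitor):
    """A450 nm reduction (%) = (A450 with competitor / A450 without competitor) x 100."""
    if a450_without_competitor <= 0:
        raise ValueError("A450 without competitor must be positive")
    return a450_with_competitor / a450_without_competitor * 100.0


def pick_improved_clones(readings, threshold_percent=50.0):
    """Return clone names whose A450 reduction (%) falls below the threshold.

    `readings` maps a clone name to (A450 with competitor, A450 without competitor).
    """
    return [
        name
        for name, (with_comp, without_comp) in readings.items()
        if a450_reduction_percent(with_comp, without_comp) < threshold_percent
    ]


if __name__ == "__main__":
    # Hypothetical plate readings, for illustration only.
    example = {
        "clone_A": (0.30, 1.50),  # 20% of the uncompeted signal remains
        "clone_B": (1.20, 1.50),  # 80% remains, so not picked
    }
    print(pick_improved_clones(example))
```

A lower percentage means that soluble antigen competes the phage off the plate more effectively, which is why clones under the cut-off are the candidates for improved affinity.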

Antibody binding assays. JAG1 and JAG2 antibodies were tested for binding to 
recombinant purified Notch ligands human JAG1 (hJAG1), mouse JAG1 (mJAG1), 
human JAG2 (hJAG2), murine JAG2 (mJAG2), human DLL1 (hDLL1), mouse 
DLL1 (mDLL1), human DLL4 (hDLL4) and mouse DLL4 (mDLL4) using a standard
ELISA. Notch ligand protein (1 µg ml⁻¹) in PBS, pH 7.4, was coated on ELISA
plates (Nunc Maxisorp) at 4°C overnight. Plates were blocked with casein blocker 
in PBS (Pierce) for 1h at room temperature. Serial threefold dilutions of antibody 
IgGs in PBST buffer were added to the plates and incubated for 1 h at room tem- 
perature. The plates were then washed with PBST and bound antibodies were 
detected with peroxidase-conjugated goat anti-human Fab specific IgG (Sigma). 
TMB substrate (3,3’,5,5’-tetramethylbenzidine) was used and the reactions were 
stopped with 100 µl 0.1 M phosphoric acid (H3PO4) before absorbance at 450 nm 
was read using a standard ELISA plate reader. Absorbance was plotted against 
concentrations of IgGs using Prism 6 (GraphPad Software). 

Antibody binding affinities. Antibody binding affinities and rate constants were 
measured by surface plasmon resonance using a BIAcore-T200 instrument. For 
human JAG1, human and mouse JAG2, human DLL1, human and mouse DLL4 
affinity measurements, human IgG versions of anti-JAG1.b70 and anti-JAG2.b33 
antibodies were captured by mouse anti-human Fc antibody (GE Healthcare, 
BR-1008-39) coated on CM5 biosensor chips to achieve approximately 200 response 
units (RU). For kinetics measurements, fourfold serial dilutions (480-0.117 nM) of 
ligands were injected in HBS-T buffer (0.01 M HEPES, pH 7.4, 0.15 M NaCl, 
0.05% (v/v) Surfactant P20, GE Healthcare) at 25 °C with a flow rate of 30 µl min⁻¹.
For mouse JAG1 and DLL1, owing to their background binding to empty flow cell, 
the kinetic parameters were determined via directly coating the ligands on the CM5 
biosensor chip. Purified antibodies in fragment antigen-binding (Fab fragment) 
format were then flowed over the biosensor chip. Mouse JAG1 or DLL1 was
coated on CM5 biosensor chips to achieve approximately 100 RU. Fivefold serial
dilutions (500-0.16 nM) of Fab fragments were then injected in HBS-T at 25 °C
with a flow rate of 30 µl min⁻¹. Association (kon) and dissociation (koff) rates were
calculated using a simple one-to-one Langmuir binding model (BIAcore Evaluation
T200 Software version 2.0). The equilibrium dissociation constant (KD) was calculated
as the ratio koff/kon. For affinity analysis, KD was calculated using a steady
state affinity model.
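
The arithmetic behind the dilution series and the kinetic KD can be sketched as below. The number of points in each series (seven for the fourfold series, six for the fivefold series) is inferred from the stated end points rather than given in the text, and the rate constants in the example are hypothetical values chosen only to show the units.

```python
def serial_dilution(top_concentration_nm, fold, n_points):
    """Concentrations (nM) of a serial dilution, starting at the top concentration."""
    return [top_concentration_nm / fold**i for i in range(n_points)]


def equilibrium_dissociation_constant(k_off_per_s, k_on_per_m_per_s):
    """KD (M) as the ratio koff/kon for a one-to-one binding model."""
    if k_on_per_m_per_s <= 0:
        raise ValueError("kon must be positive")
    return k_off_per_s / k_on_per_m_per_s


if __name__ == "__main__":
    fourfold = serial_dilution(480.0, 4, 7)   # ligand series, 480 nM down to about 0.117 nM
    fivefold = serial_dilution(500.0, 5, 6)   # Fab series, 500 nM down to 0.16 nM
    print([round(c, 3) for c in fourfold])
    print([round(c, 3) for c in fivefold])

    # Hypothetical rate constants: koff = 1e-3 s^-1, kon = 1e5 M^-1 s^-1.
    kd_molar = equilibrium_dissociation_constant(1e-3, 1e5)
    print(f"KD = {kd_molar * 1e9:.1f} nM")
```

Six fourfold steps from 480 nM and five fivefold steps from 500 nM reproduce the reported lower limits of 0.117 nM and 0.16 nM, respectively.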

Notch reporter assays. U87 glioblastoma cells, which endogenously express pre- 
dominantly NOTCH2 but only very low levels of other NOTCH receptors, were 
co-transfected with a Notch-responsive TP-1 (12X CSL) firefly luciferase reporter 
and a constitutively expressed Renilla luciferase reporter (pRL-CMV, Promega 
E2261) to control for transfection efficiency. Antibodies were added with the 
ligand-expressing cells (NIH-3T3 cells stably transfected with human JAG1 or
JAG2) 6-8 h after transfection. Luciferase activities were measured after 20 h of
co-culture (Dual Glo Luciferase, Promega E2920), using a PerkinElmer EnVision
2103 Multilabel Reader. Typically, four replicates were analysed for each condition, 
and values were expressed as relative luciferase units (firefly signal divided by the 
Renilla signal) and graphed as percentage of signalling relative to anti-ragweed 
isotype control antibody, which was set at 100%. The cell line tested negative for 
mycoplasma. 
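
The normalization described above can be sketched in two steps: divide firefly by Renilla signal per replicate, then express the mean of each condition as a percentage of the isotype control mean. Averaging replicates before taking the ratio to control is an assumption, as are the function names and the example readings.

```python
from statistics import mean


def relative_luciferase_units(firefly, renilla):
    """Per-replicate firefly signal divided by the Renilla signal."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def percent_of_control(sample_rlu, control_rlu):
    """Mean sample signal as a percentage of the mean control signal (control = 100%)."""
    return mean(sample_rlu) / mean(control_rlu) * 100.0


if __name__ == "__main__":
    # Hypothetical readings for four replicates per condition.
    control = relative_luciferase_units([900, 1100, 1000, 1000], [100, 100, 100, 100])
    treated = relative_luciferase_units([240, 260, 250, 250], [100, 100, 100, 100])
    print(round(percent_of_control(treated, control), 1))
```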

NOTCH2 intracellular domain immunoblot analysis. NOTCH2 signalling was 
induced in U87 glioblastoma cells by incubation with JAG1-coated (R&D, 599-JG) 
beads (Bangs Laboratories, BM562) for 24h. Cells were subsequently collected 
and nuclear fractions were isolated, and 10 µg of protein was run on a 4-12%
NuPAGE Novex Bis-Tris gel buffered with MOPS (Life Technologies) for 90 min
at a constant voltage of 200 V. Proteins were then transferred to PVDF membrane. 
The NOTCH2 intracellular domain was detected using our in-house antibody 
(clone 40-2-7)** at a concentration of 0.2 µg ml⁻¹. The nuclear protein CREB (Cell 
Signaling 9197, clone 48H2) was detected as a loading control. 

JAG1, JAG2, DLL1 and DLL4 fragment reagent proteins. Mouse DLL1 
(M1-Q516-CHis) and human DLL1 (M1-G540-CHis) were purchased from 
Sino Biological (50522-M08H and 11635-H08H-50, respectively). Mouse DLL4 
(S28-P525-CHis) and human DLL4 (S27-P524-CHis) were purchased from R&D 
Systems (1389-D4 and 1506-D4, respectively). Mouse JAG1 (M1-D387-CHis), 
human JAG1 (M1-D387-CHis), mouse JAG2 (M1-E403-CHis) and human JAG2 
(M1-E403-CHis) were cloned into a modified pAcGP67A vector and all constructs 
were confirmed by DNA sequencing. Recombinant baculovirus was generated 
using the Baculogold system (BD Biosciences) following standard protocols. The 
virus was amplified twice to prepare the stock used for protein expression. Protein 
was expressed in Sf9 cells in serum-free ESF921 (Expression Systems LLC) and cells 
were grown to 2 × 10⁶ cells per ml and infected with the appropriate virus at a ratio
of virus/culture of 5.0 ml l⁻¹ (v/v). The Wave reactor was maintained at 27 °C and
25 r.p.m. with a fixed angle of 9° and 0.3 l min⁻¹ of 30% oxygen. At 48 h after infection,
cells were pelleted by centrifugation at 5,000g for 15 min. The culture media were
supplemented with 50 mM Tris, pH 8, 5 mM CaCl2 and 1 mM NiCl2 and proteins
were purified over a Ni-NTA Superflow column (Qiagen), washed with 10 column 
volumes of buffer P1 (20mM Tris, pH 8, 300mM NaCl and 30 mM imidazole), 
eluted with buffer P2 (20 mM Tris, pH 8, 300 mM NaCl and 300 mM imida- 
zole), and further purified over a Superdex 75 column (GE Healthcare) in buffer 
P3 (50 mM HEPES, pH 7.5, 200 mM NaCl and 5 mM CaCl2). Peak fractions were
analysed by SDS-PAGE, pooled, aliquoted and frozen at −80 °C. Protein sequences
of these constructs with signal sequences and purification tags underlined were 
mouse JAG1: MLLVNQSHQGFNKEHTSKMVSAIVLY VLLAAAAHSAFAAD 
LGSQFELEILSMQNVNGELQNGNCCGGVRNPGDRKCTRDECDTYFKVC 
LKEYQSRV TAGGPCSFGSGSTPVIGGNTFNLKASRGNDRNRIVLPFSFAWP 
RSYTLLVEAWDSSNDTIQPDSITEKASHSGMINPSRQWQTLKQNTGIAHF 
EYQIRV TCDDHY YGFGCNKFCRPRDDFFGH YACDQNGNKTCMEGWM 
GPDCNKAICRQGCSPKHGSCKLPGDCRCQYGWQGLYCDKCIPHPGCV 
HGTCNEPWQCLCETNWGGQLCDKDLNYCGTHQPCLNRGTCSNTGPD 
KYQCSCPEGYSGPNCEIAEHACLSDPCHNRGSCKETSSGFECECSPGWTGP 
TCSTNIDDEEGLVPRGSGHHHHHH; human JAG1: MLLVNQSHQGFNKEH 
TSKMVSAIVLY VLLAAAAHSAFAA DLGSQFELEILSMQNVNGELQNGNC 
CGGARNPGDRKCTRDECDT YFKVCLKEYQSRV TAGGPCSFGSGSTPVIG 
GNTFNLKASRGNDRNRIVLPFSFAWPRSY TLLVEAW DSSNDT VQPDSIE 
KASHSGMINPSRQWQTLKQNTGVAHFEYQIRVTCDDYY YGFGCNKFCR 
PRDDFFGHYACDQNGNKTCMEGWMGPECNRAICRQGCSPKHGSCKLP 
GDCRCQYGWQGLYCDKCIPHPGCV HGICNEPWQCLCETNWGGQLCD 
KDLNYCGTHQPCLNGGTCSNTGPDKYQCSCPEGYSGPNCEIAEHACLSD 
PCHNRGSCKETSLGFECECSPGW TGPTCSTNIDDEFGLVPRGSGHHHHHH; 
mouse JAG2: MLLVNQSHQGFNKEHTSKMVSAIVLY VLLAAAAHSAFA 
ADLGSYFELQLSALRN VNGELLSGACCDGDGRTTRAGGCGRDECDTYV 
RVCLKEYQAKVTPTGPCSYGYGATPVLGGNSFYLPPAGAAGDRARARSRT 
GGHQDPGLVVIPFQFAW PRSFTLIVEAWDWDNDTTPDEELLIERVSHAG 
MINPEDRWKSLHFSGHVAHLELQIRVRCDENY YSATCNKFCRPRNDFFG 
HYTCDQYGNKACMDGW MGKECKEAVCKQGCNLLHGGCT VPGECRCS 
YGWQGKFCDECVPYPGCVHGSCVEPWHCDCETNWGGLLCDKDLNYC 
GSHHPCVNGGTCINAEPDQYLCACPDGYLGKNCERAEHACASNPCANG 
GSCHEVPSGFECHCPSGWNGPTCALDIDEEFGLVPRGSGHHHHHH; 
human JAG2: MLLVNQSHQGFNKEHTSKMVSAIVLY VLLAAAAHSAFA 
ADLGSYFELQLSALRNVNGELLSGACCDGDGRTTRAGGCGHDECDTYV 
RVCLKEYQAKVTPTGPCSYGHGATPVLGGNSFYLPPAGAAGDRARARAR 
AGGDQDPGLVVIPFQFAWPRSFTLIVEAWDWDNDTTPNEELLIERVSHA 
GMINPEDRWKSLHFSGH VAHLELQIRVRCDENY YSATCNKFCRPRNDFF 
GHYTCDQYGNKACMDGW MGKECKEAVCKQGCNLLHGGCTVPGECR 
CSYGWQGRFCDECVPYPGCVHGSCVEPWQCNCETNWGGLLCDKDLN 
YCGSHHPCTNGGTCINAEPDQYRCTCPDGYSGRNCEKAEHACTSNPCA 
NGGSCHEVPSGFECHCPSGWSGPTCALDIDEEFGLVPRGSGHHHHHH. 
Anti-JAG1 antibody expression and Fab purification. Anti-JAG1 was expressed 
in CHO cells and purified from cell-conditioned media using MabSelect SuRe resin 
(GE Healthcare). After loading, resin was washed with five column volumes of: 
buffer A (25 mM Tris, pH 7.5, 150mM NaCl and 5mM EDTA), buffer B (25 mM 
Tris, pH 7.5, 150mM NaCl, 5mM EDTA and 0.1% (v/v) Triton X-114), buffer C 
(400 mM potassium phosphate, 5 mM EDTA and 0.2% (v/v) Tween 20), and again 
with buffer A. The antibody was eluted with buffer D (50mM sodium acetate 
pH 3.0, 50mM NaCl), neutralized by adding 1.5 M Tris, pH 9.0 (to pH 7.0), and 
purified over a Superdex 200 column (GE Healthcare) in PBS, pH 7.4, 150 mM 
NaCl. Peak fractions were incubated with lysyl endopeptidase (Wako Chemicals, 
Inc.) at 37°C for 1h to generate the anti-JAG1 Fab and cleavage was stopped by 
addition of sodium acetate (pH 3.0, 250 mM final). Finally, the Fab was purified 
over an SP Sepharose Fast Flow column (GE Healthcare) using a 0-30% (w/v) NaCl 
gradient and molecular mass was confirmed by liquid chromatography-time-of-flight
mass spectrometry (LC-MS TOF).
JAG1 DSL-EGF1-4 expression and purification. Human JAG1 (V187-D377; 
ECD) was cloned, expressed and purified as described above for other JAG1 and 
JAG2 proteins. After purification over a Superdex 75 column in buffer P3 (defined 
above), peak fractions were incubated with TEV protease to remove the His tag. 
Cleaved JAG1 was collected in the flow-through from a Ni-NTA column and repu- 
rified over a Superdex 75 column in the same buffer. 
Crystallization and structure determination. The JAG1-ECD-anti-JAG1.b70 Fab 
complex was prepared by adding a molar excess of JAG1 ECD and purified over a 
Superdex S200 column (GE Healthcare) in buffer P3. Peak fractions containing the
complex were concentrated to 6 mg ml⁻¹ and crystallized by sitting drop vapour
diffusion in a 1:1 ratio with 100 mM CHES, pH 9.5, 20% PEG8000 at 19°C. Crystals 
were cryoprotected with 15% (v/v) glycerol before flash freezing in liquid nitrogen 
and maintained at 100 K during data collection. Native data were collected at the 
Stanford Synchrotron Radiation Lightsource (SSRL 12.2) using a PILATUS6M 
PAD detector. Diffraction data were integrated and scaled using autoPROC 
(Global Phasing)** to 2.56 Å and, when required, further processed using the CCP4 
package*’. Available JAG1 ECD (PDB 2VJ2) and Fab (PDB 2ROL) structural mod- 
els were used after trimming CDRs to poly-Ala for the initial molecular replace- 
ment solution using the Phenix software”. Reiterative building using COOT*” 
and refinement in Buster (Global Phasing)** were used to arrive at the final
JAG1-anti-JAG1.b70 model. EGF4 was present in the crystallized JAG1 construct
but not modelled, as only diffuse electron density was observed for this domain. 
Two independent JAG1-anti-JAG1.b70 complexes of essentially identical structure are
present in the asymmetric unit. Geometry was assessed using PROCHECK”” 
and MolProbity*° and structural figures were prepared with the PyMol software"". 
Generation of Scgb1a1-CreERT2°™® mice. The construct for targeting the 
CreERT2 recombinase into the C57BL/6 Scgb1a1 locus in embryonic stem (ES) 
cells was made using a combination of recombineering and standard molecular 
cloning techniques. In brief, a cassette (CreERT2 SV40 pA, and frt-PGK-em7-
Neo-BGHpA-frt) flanked by short homologies to the mouse Scgb1a1 gene was
used to modify an Scgb1a1 C57BL/6J bacterial artificial chromosome (BAC)
(RP23-234B14) by recombineering. The CreERT2 cDNA cassette was inserted at
the endogenous ATG and the remainder of the Scgb1a1 exon 1 plus the beginning
of intron 1 were deleted. The targeted region in the BAC was then retrieved into 
pBlight-TK along with flanking genomic Scgb1a1 sequences as homology arms for 
ES cell targeting. Specifically, the 2939-base pair (bp) 5’ homology arm corresponds 
to NCBI37/mm9 chr19:9,162,392-9,165,330 (reverse strand) and the 2660-bp 
3’ homology arm corresponds to chr19:9,159,677-9,162,336 (reverse strand). The 
final vector was confirmed by DNA sequencing. The vector was linearized with NotI,
and C57BL/6N C2 ES cells were targeted using standard methods (G418-positive
and ganciclovir-negative selection). Positive clones were identified using polymerase
chain reaction (PCR) and TaqMan analysis and confirmed by sequencing
of the modified locus. Correctly targeted ES cells were transfected with an FLPe 
plasmid to remove Neo, and ES cells were then injected into blastocysts using 
standard techniques. Germline transmission was obtained after crossing result- 
ing chimaeras with C57BL/6N females. Founders were determined by long PCR 
sequencing and screening for club cell specific recombination after crossing to the 
Rosa26-lsl-tdTomato reporter mouse (Jax stock 007914). Reporter expression was
not observed in the airways of adult mice without tamoxifen injection.
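
The stated homology arm lengths can be checked against the NCBI37/mm9 coordinates given for the targeting vector. The sketch below assumes the coordinates are 1-based and inclusive at both ends, so the length is end minus start plus one; the helper names are illustrative.

```python
def parse_interval(text):
    """Parse 'chr19:9,162,392-9,165,330' into (chromosome, start, end)."""
    chromosome, span = text.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    return chromosome, start, end


def interval_length(text):
    """Length in bp of a 1-based, inclusive genomic interval."""
    _, start, end = parse_interval(text)
    return end - start + 1


if __name__ == "__main__":
    arms = {
        "5' homology arm": "chr19:9,162,392-9,165,330",
        "3' homology arm": "chr19:9,159,677-9,162,336",
    }
    for name, coordinates in arms.items():
        print(name, interval_length(coordinates), "bp")
```

With inclusive coordinates the two intervals come out at 2939 bp and 2660 bp, matching the arm lengths stated above.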
Genotyping of Scgb1a1-CreERT2°% mice. For genotyping, three
primers were used, forward1-(F1): 5’-TCTCCTAAGTGGAGCGCAATC-3’,
forward2-(F2): 5’-GCATCTGTACAGCATGAAGTGC-3’ and reverse-(R):
5’-GACGCAATGCTTCTGAGAGTC-3’. PCR amplification yielded a 295-bp
product for the wild-type allele (F1+R primers) and a 646-bp product for the 
knock-in allele (F2+R primers). 
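
Reading a genotype off the two expected products could be sketched as below. Only the 295-bp and 646-bp product sizes come from the text; the size tolerance, the function name and the genotype labels are assumptions made for illustration.

```python
WILD_TYPE_BP = 295  # F1 + R product (wild-type allele)
KNOCK_IN_BP = 646   # F2 + R product (knock-in allele)


def call_genotype(band_sizes_bp, tolerance_bp=25):
    """Call a genotype from estimated gel band sizes (bp).

    `tolerance_bp` is a hypothetical allowance for sizing error on a gel.
    """
    has_wild_type = any(abs(b - WILD_TYPE_BP) <= tolerance_bp for b in band_sizes_bp)
    has_knock_in = any(abs(b - KNOCK_IN_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_wild_type and has_knock_in:
        return "heterozygous"
    if has_knock_in:
        return "homozygous knock-in"
    if has_wild_type:
        return "wild type"
    return "no call"


if __name__ == "__main__":
    print(call_genotype([300, 650]))
    print(call_genotype([290]))
```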
Mice. Animal studies were conducted in accordance with the Guide for the Care 
and Use of Laboratory Animals, published by the National Academy Press (2006). 
Female BALB/c and C57BL/6 mice were obtained from Jackson Laboratories 
or Charles River Laboratories. Rosa26-lsl-tdTomato mice (stock 007914) were 
obtained from The Jackson Laboratory and maintained on a C57BL/6 back- 
ground. All mice were housed under specific pathogen-free (SPF) conditions and 
used at 8-12 weeks of age. Investigators performing mouse experiments were not 
blinded. The Genentech Institutional Animal Care and Use Committee (IACUC) 
approved all animal studies. Mice were injected intraperitoneally with blocking
antibodies diluted in 200 µl of PBS at the following concentrations: anti-JAG1.b70
at 15 mg kg⁻¹, anti-JAG2.b33 at 15 mg kg⁻¹, anti-NRR1 at 10 mg kg⁻¹, anti-NRR2 at
30 mg kg⁻¹, anti-ragweed (non-targeting control antibody) at concentrations equal
to the maximum dose of blocking antibodies. Anti-ragweed was also supplemented 
to achieve equal dose of total antibody injected in each study. 

For the lineage-tracing studies, both male and female Scgb1a1-CreERT2;
Rosa26-lsl-tdTomato mice were used. Compound transgenic mice were induced with
four doses of 200 mg kg⁻¹ tamoxifen (T5648, Sigma) diluted in sesame seed oil and
were subsequently allowed 1 week to recover. BrdU was added to drinking water
supplemented with 5% sucrose at a concentration of 1 mg ml⁻¹ and was renewed
every three days. 

For intranasal administration of antibodies, anti-JAG1.b70 and anti-JAG2.b33
were each diluted to a concentration of 4 mg ml⁻¹ in saline and anti-ragweed
control was diluted to a concentration of 8 mg ml⁻¹. After anaesthesia with avertin,
62.5 µl of antibodies was instilled as two 31.25-µl doses, one
into each nostril. Mice were allowed to recover and were analysed five days later.
Immunohistological staining. Lungs were cleared of blood by right ventricular
perfusion with saline solution containing 2 U ml⁻¹ heparin, inflated with 4%
paraformaldehyde (PFA) in PBS and submerged in Zfix overnight. Individual 
lobes were separated and embedded in paraffin and subsequently sectioned at 
5 µm. Sections were boiled for 15 min in target retrieval solution (DAKO-S1700) 
in a pressure cooker resulting in rehydration and antigen retrieval. Sections were 
then permeabilized with 0.2% Triton X-100 in PBS for 45 min and blocked for 
1h with 5% FBS and 2% BSA in PBS. For BrdU staining, paraffin was removed 
from sections by treating with xylenes, and the sections were rehydrated through 
a gradient of ethanol before a 1-h incubation in 2 N HCl. After washing in PBS, 
sections were incubated with 0.05% Trypsin/EDTA for 30 min and then washed 
with PBS before immunofluorescence staining as above. Primary antibodies were 
incubated in blocking buffer (5% FBS, 2% BSA in PBS) for 2h at room temperature. 
The following primary antibodies were used: anti-CC10 (1:1,000; sc9772, Santa 
Cruz), anti-acetylated-α-tubulin (1:200; sc23950, Santa Cruz), anti-FOXJ1 (1:200; 
14-9965-82, eBioscience), anti-BrdU (1:200; 347580, BD), anti-tdTomato (1:1,000; 
600-401-379, Rockland). Secondary antibodies were incubated in blocking buffer 
for 1-2h at room temperature. The following secondary antibodies were used: 
anti-mouse Alexa Fluor 488 (1:1,000; A21202, Invitrogen), anti-rabbit Alexa Fluor 
488 (1:1,000; A-21206, Invitrogen), anti-goat Alexa Fluor 488 (1:1,000; A-11055; 
Invitrogen), anti-mouse Alexa Fluor 555 (1:1,000; A31570, Invitrogen), anti-rabbit 
Alexa Fluor 555 (1:1,000; A-31572, Invitrogen), anti-goat Alexa Fluor 555 (1:1,000; 
A-21432; Invitrogen), anti-mouse Alexa Fluor 647 (1:1,000; A31571, Invitrogen), 
anti-rabbit Alexa Fluor 647 (1:1,000; A-31573, Invitrogen), anti-goat Alexa 
Fluor 647 (1:1,000; A-21417; Invitrogen). Nuclei were stained with DAPI. Slides 
were imaged using an Olympus BX-61 upright wide field microscope equipped 
with the following objectives: 10× UPlanS APO 0.4 numerical aperture (NA),
20× UPlanS APO 0.75 NA, 40× UPlanS APO 0.9 NA and 60× UPlan FLN 0.9 NA,
as well as the following filters: DAPI (ex. 387/11, em. 447/60), FITC (ex. 482/35, 
em. 536/40), Cy3 (ex. 531/40, em. 593/40), Cy5 (ex. 628/40, em. 692/40). Images 
were obtained using SlideBook software (3i - Intelligent Imaging Innovations) and 
were pseudo-coloured and edited using Photoshop CS6 (Adobe). 

Immunohistochemistry staining was performed on 4-µm thick formalin-fixed,
paraffin-embedded tissue sections mounted on glass slides as previously
described**. In brief, primary antibodies against JAG1, polyclonal (sc-6011, Santa 
Cruz Biotechnology), Notch-1, clone D1E11 (3608, Cell Signaling Technologies), 
Notch-2, clone D76A6 (5732s, Cell Signaling Technologies) and Hes-1, clone NM1 
(D134-3, MBL International) were used at 0.7 µg ml⁻¹, 5 µg ml⁻¹, 8 µg ml⁻¹ and
1 µg ml⁻¹, respectively. NOTCH1, NOTCH2 and JAG1 staining was carried out
on the Ventana Discovery XT automated platform (Ventana Medical Systems). 
Sections were treated with Cell Conditioner 1, standard time. Specifically bound 
primary antibody was detected by incubating sections in OmniMap anti-rabbit- 
HRP (Ventana Medical Systems) for NOTCH1, NOTCH2 and OmniMap anti- 
goat-HRP (Ventana Medical Systems) for JAG1 followed by ChromoMap DAB 
(Ventana Medical Systems). HES1 staining was performed on the DAKO auto- 
stainer, using Target pH6 (Dako) antigen retrieval. Detection used donkey anti- 
rat biotinylated secondary (Jackson ImmunoResearch Laboratories), followed by 
streptavidin-HRP with TSA enhancement (PerkinElmer) and DAB visualization 
(Pierce). The sections were counterstained with haematoxylin and dehydrated. 
For the Alcian blue/PAS staining, tissues were fixed 24h at ambient temperature 
in 10% neutral buffered formalin (VWR) then processed and embedded using a 
Tissue-Tek VIP processor (Sakura). Four-micrometre sections were mounted on 
Superfrost Plus glass slides (Richard-Allan) and dried for 30 min at 60°C, then 
dewaxed and rehydrated before staining. Alcian blue/PAS staining was performed 
using an Artisan automated stainer and staining reagents according to the man- 
ufacturer’s instructions (AR16911-2, DAKO). Slides were then air-dried, cleared 
with xylene, and mounted with a synthetic mounting medium (Tissue-Tek Glas, 
Sakura). 

Standard (morphology) transmission electron microscopy. For standard trans- 
mission electron microscopy the lung tissues were fixed in 1/2 Karnovsky’s fixa- 
tive (2% PFA, 2.5% glutaraldehyde in 0.1 M sodium cacodylate buffer, pH 7.2). 
The samples were post-fixed in 1% aqueous osmium tetroxide for 2h, stained 
with 0.5% uranyl acetate for 1h and then dehydrated through a series of ethanol 
(50%, 70%, 90%, 100%) followed by two propylene oxide washes. Samples were
embedded in Eponate 12 (Ted Pella). Curing of the samples was at 65 °C for 2 days.
Semithin (300 nm) and ultrathin (80 nm) sections were obtained with an Ultracut 
microtome (Leica). The semithin sections were stained with Toluidine Blue and 
examined by bright field microscopy to identify tissue areas with terminal bronchioles.
Then parallel ultrathin sections were prepared, counterstained with 0.2%
lead citrate and examined in a JEOL JEM-1400 transmission electron microscope 
at 80kV. Digital images were captured with a GATAN Ultrascan 1000 CCD camera. 
Immunogold electron microscopy. For immuno-electron microscopy studies, 
lungs were fixed in 4% PFA in 0.1 M phosphate buffer (pH 7.2) for several days 
and stored at 4°C. Tissue samples were cut into 1-mm pieces, washed in PBS 
and quenched for 5 min in 0.15% glycine in PBS before being rinsed in water. 
Samples were then dehydrated with an ascending series of dimethylformamide 
in water followed by two 100% dimethylformamide steps; each step for 15 min 
at 4°C. The tissues were finally infiltrated with LR White resin (London Resin 
Company) and cured at 55°C for 2 days. Semithin (300 nm) and ultrathin (80 nm) 
sections were obtained with an Ultracut microtome (Leica). The semithin sec- 
tions were transferred to glass slides, stained with Toluidine Blue and examined 
by bright field microscopy to identify areas containing terminal bronchioles. Light 
microscopy images were acquired at 1,000 x using a Zeiss Axioplan microscope 
and a Zeiss Plan Apochromat objective (100 x, 1.4 NA, oil immersion). Ultrathin 
(80 nm) sections of areas containing terminal bronchioles were then transferred 
to transmission electron microscopy grids and used for immunogold labelling. 
Grids were simultaneously incubated with a (mouse) monoclonal antibody for 
acetylated-α-tubulin (Abcam, ab24610, dilution 1:25) and a (goat) polyclonal antibody
for CC10 (Santa Cruz, sc9772, dilution 1:100). Secondary antibodies were
(donkey)-anti-mouse-12 nm and (donkey)-anti-goat-18 nm antibody-gold conjugates
(Jackson ImmunoResearch) diluted at 1:20 and 1:10, respectively. Labelled
sections were counterstained with 0.5% uranyl acetate in water for 5min at room 
temperature and examined in a JEOL JEM-1400 transmission electron microscope 
at 80kV. Digital images were captured with a GATAN Ultrascan 1000 CCD camera 
at magnifications from 500-50,000 x. Specificity of the labelling was determined 
by confirming absence of labelling in a negative control (no primary antibodies, 
but all secondary antibodies were used), absence of labelling in unrelated areas 
and confirmation of expected labelling over cilia (anti-acetylated tubulin anti- 
body) and secretory vesicles (anti-CC10 antibody), respectively, in single-labelling 
experiments. 

Scanning electron microscopy. For scanning electron microscopy, 1-mm thick 
sections through the lobes of the lungs were fixed in 1/2 Karnovsky’s fixative (see 
earlier). The samples were then post-fixed in 1% aqueous osmium tetroxide, 
stained with 1% uranyl acetate for 2h and then dehydrated through a series of 
ethanol (50%, 70%, 90%, 95%, 100%) followed by three changes in 100% hexa- 
methyldisilazane. Finally, the samples were mounted on scanning electron micros- 
copy stubs, air dried and coated with 5-nm palladium-gold using an EMS150R ES 
sputter coater (Electron Microscopy Sciences). The samples were imaged with an 
FEI XL30 ESEM in secondary electron mode at 5kV and 15mm working distance. 
Images were acquired at magnifications from 500 x to 5,000x. 

Whole-mount staining. Staining was performed as previously described”. 
Following perfusion of the lung to remove blood, lungs were inflated and fixed in 
4% PFA overnight. The right caudal lobe was removed and placed in 1% Triton X-100 
in PBS until the tissue sank to the bottom of the tube. Whole-mount lobes were 
stained with anti-RFP (1:1,000; 600-401-379, Rockland) for 72h in 4ml blocking 
buffer (5% FBS, 2% BSA in PBS) containing 0.2% Triton X-100 at 4°C and were 
subsequently washed in a large volume of 0.2% Triton X-100 in PBS for 6-8 h. 
Whole-mount lobes were subsequently stained with anti-rabbit Alexa Fluor 555 
(1:1,000; A-31572, Invitrogen) secondary antibody overnight in 4 ml blocking 
buffer containing 0.2% Triton X-100 at 4°C and were subsequently washed in a 
large volume of 0.2% Triton X-100 in PBS for 6-8h and were subsequently stored 
in PBS until clearing. 

Optical clearing of whole-mounts. For optical clearing, lobes were dehydrated 
through a gradient of tetrahydrofuran: 50% for 30 min, 70% for 30 min, 80% for
30 min, and three times 100% for 30 min each. Dehydrated lobes were subsequently
incubated in dimethyl ether for 20 min followed by 30 min in dibenzyl ether to
obtain an optically cleared sample. Imaging was performed using the LaVision 
Ultramicroscope (LaVision BioTec GmbH) composed of an Olympus MVX10
stereomicroscope equipped with a 2× 0.3 NA SFD-PLAPO air objective and a
sCMOS pco.edge camera. Images were acquired using Imspecter software
(LaVision BioTec GmbH) and were subsequently analysed with Fiji.

RNA extraction and qPCR. For RNA extraction, half a lung was lysed and
homogenized in 4 ml buffer RLT with 2-mercaptoethanol. Lysates (350 µl each) were
run through a QIAshredder column and RNA isolation was completed with the
Qiagen RNeasy kit as per the manufacturer’s instructions with DNase digestion.
Eluted RNA was resuspended in 60 µl total volume of RNase-free water. cDNA was
synthesized with the ABI High Capacity cDNA Reverse Transcription Kit, using


© 2015 Macmillan Publishers Limited. All rights reserved 


200 ng RNA in 20 µl total reaction volume as per the manufacturer's instructions.
cDNAs were preamplified and prepared for quantitative real-time PCR as per the
Preamp Protocol: Fluidigm Specific Target Amplification. All real-time PCR
reactions were run on the Fluidigm platform with the following TaqMan assays: Gapdh
(Mm99999915_g1), Hes1 (Mm01342805_m1), Hey1 (Mm00468865_m1), Hey2
(Mm00469280_m1), Foxj1 (Mm01267279_m1), Scgb1a1 (Mm00442046_m1),
Muc5b (Mm00466391_m1) and Muc5ac (Mm01276718_m1).

RNAscope in situ hybridization. RNA in situ hybridization was performed by
Advanced Cell Diagnostics, Inc. Hybridization for mouse Notch1 (404641-C3),
Notch2 (425161-C1), Jag1 (412831-C1), Jag2 (417511-C1), Foxj1 (317091-C2) and
Scgb1a1 (420351-C3) mRNA was performed manually using the RNAscope Multiplex
Fluorescent Reagent Kit (320850) according to the manufacturer's instructions.
In brief, 5-µm formalin-fixed, paraffin-embedded tissue sections were pre-treated
with heat and protease before hybridization with the target oligonucleotide probes.
Preamplifier, amplifier and alkaline-phosphatase-labelled oligonucleotides were then
hybridized sequentially, coupled with a fluorescent conjugate. Each sample was
quality controlled for RNA integrity with an RNAscope probe specific to
PolR2A/PPIB/UBC RNA (320881) and for background with a probe specific to bacterial
dapB RNA (320881). Specific RNA staining signal for each of Notch1, Notch2, Jag1
and Jag2 was identified as green, Foxj1 RNA was identified as red and Scgb1a1 RNA
was identified as white punctate dots. Samples were counterstained with DAPI.

Analysis of clonal expansion of lineage-traced cells. For the quantification
of the clonal expansion of lineage-traced club cells, Scgb1a1-CreERT2°N"/
Rosa26-lsl-tdTomato compound transgenic mice were induced with four doses
of 200 mg kg⁻¹ tamoxifen (T5648, Sigma) diluted in sesame seed oil and were
allowed 1 week to recover. Four mice were then treated with a single dose of
either anti-ragweed non-targeting isotype control antibody at 30 mg kg⁻¹ or
anti-JAG1.b70 at 15 mg kg⁻¹ plus anti-JAG2.b33 at 15 mg kg⁻¹. Lungs were
sectioned 13 weeks after treatment and stained by immunofluorescence for the
lineage-tracing marker tdTomato, CC10 to mark club cells, and acetylated-α-tubulin
or FOXJ1 to mark ciliated cells. Clones were defined as single cells, small
clusters of 2-4 cells or clusters of more than 5 cells. Quantifications were
done by eye and were calculated as the number of clones of each of the three
types over the total number of clones.
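
The clone scoring described above was done by eye; the arithmetic behind it can be sketched in a few lines. The following Python sketch is illustrative only and is not the authors' procedure: the function names and the example clone sizes are hypothetical, and because the text leaves a clone of exactly 5 cells unassigned, the sketch assumes it is grouped with the larger clusters.

```python
from collections import Counter

def classify_clone(size: int) -> str:
    """Assign a clone to one of three size classes (upper boundary partly assumed)."""
    if size < 1:
        raise ValueError("a clone must contain at least one cell")
    if size == 1:
        return "single"
    if size <= 4:
        return "small"   # small clusters of 2-4 cells
    return "large"       # assumption: a clone of exactly 5 cells counts as large

def clone_fractions(sizes):
    """Return the number of clones in each class over the total number of clones."""
    counts = Counter(classify_clone(s) for s in sizes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no clones were scored")
    return {k: counts.get(k, 0) / total for k in ("single", "small", "large")}

# Hypothetical clone sizes scored in one section (not data from the study)
example_sizes = [1, 1, 2, 3, 7, 1, 4, 12]
fractions = clone_fractions(example_sizes)
```

Each fraction is computed over all scored clones, so the three values sum to 1 for any non-empty input.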

Single-cell isolation and video capture. A single-cell suspension was prepared
from treated mouse tracheas as previously described. In brief, tracheas were
resected from the bronchial bifurcation to just distal to the larynx, cleared of
adherent tissue and incubated in 0.2% (w/v) pronase (Roche Applied Science) in
Ham's F12 medium (Life Technologies) containing 1% penicillin/streptomycin
(Life Technologies) overnight at 4 °C. Imaging was performed using a 20× S Plan
Fluor objective (0.45 NA; Nikon) on a Nikon Ti-E perfect focus inverted
microscope equipped with a Live Cell environmental chamber (Pathology Devices) and a
Neo sCMOS camera (Andor, Oxford Instruments), and controlled by NIS-Elements
software (Nikon). Red fluorescence and phase channels were acquired
simultaneously at 90 frames per second using Fast Timelapse mode in NIS-Elements.
Images are displayed in greyscale; bright white cells express tdTomato. Time-lapse
acquisitions were analysed in NIS-Elements, exported as AVI files and played back
in real time.

Ovalbumin-induced goblet cell metaplasia model. Seven-to-eight-week-old
C57BL/6 mice were first sensitized by i.p. injection of 50 µg ovalbumin in 2 mg
alum. Then, 35 days later, mice were challenged for 7 consecutive days with 1%
ovalbumin aerosol for 30 min in a chamber. For the prevention study, blocking
antibodies were administered i.p. on days 36 and 39. The doses of
the antibodies were as follows: anti-ragweed control non-targeting antibody
30 mg kg⁻¹, anti-JAG1.b70 15 mg kg⁻¹, anti-JAG2.b33 15 mg kg⁻¹, anti-NRR1
10 mg kg⁻¹, anti-NRR2 20 mg kg⁻¹. Lungs were collected 24 h after the last
challenge for analyses of inflammatory infiltrate and Alcian blue and PAS staining.
For the intervention study, mice were challenged with inhaled ovalbumin on days
42 and 45 and treated with blocking antibodies on the same days. Lungs were
collected on day 48 for analyses of inflammatory infiltrate and Alcian blue and
PAS staining. At both endpoints, bronchoalveolar lavage fluid was collected, and
samples from the control, anti-JAG1.b70 and anti-JAG1.b70 plus anti-JAG2.b33
groups were analysed for numbers of immune cell populations and for cytokine
concentrations by Luminex. Blood serum was also collected and cytokine
concentrations were determined by Luminex.
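
The day numbering of the prevention study can be laid out explicitly. The Python sketch below assumes that day 0 is the day of sensitization and that the first aerosol challenge falls on day 35; under that reading, collection 24 h after the last challenge falls on day 42. The function name and dictionary keys are hypothetical.

```python
def prevention_schedule(sensitization_day=0, rest_days=35, challenge_days=7):
    """Lay out the prevention study on a day axis (day 0 = sensitization, an assumed convention)."""
    first_challenge = sensitization_day + rest_days
    challenges = list(range(first_challenge, first_challenge + challenge_days))
    return {
        "sensitization": sensitization_day,
        "challenges": challenges,            # 7 consecutive daily aerosol challenges
        "antibody_doses": [36, 39],          # dosing days as stated in the text
        "collection": challenges[-1] + 1,    # 24 h after the last challenge
    }

schedule = prevention_schedule()
```

With these assumptions both antibody doses fall inside the challenge window, 1 and 4 days after the first challenge.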

Histopathology analysis. Lung tissues were fixed for 24 h at ambient
temperature in 10% neutral buffered formalin (VWR), then processed and embedded
using a Tissue-Tek VIP processor (Sakura). Sections (4 µm) were mounted on
Superfrost Plus glass slides (Richard-Allan) and stained with haematoxylin
and eosin using a Leica Autostainer XL or with Alcian blue and PAS using an
Artisan automated stainer (DAKO) according to the manufacturer's instructions.
Slides were then air-dried, cleared with xylene, and mounted with a synthetic
mounting medium (Tissue-Tek Glas). Inflammation severity was visually scored
in a blinded fashion on haematoxylin-and-eosin-stained slides using a subjective
semi-quantitative five-point scale, in which 0 = normal lung (no inflammatory
infiltrate), 1 = minimal disease (infrequent sparsely scattered inflammatory cells),
2 = mild (light perivascular/peribronchiolar involvement), 3 = moderate (many
vessels and airways affected by substantial numbers of inflammatory cells), and
4 = severe (generalized accumulations of perivascular/peribronchiolar inflammatory
cells with frequent circumferential and/or bridging infiltrates). Goblet cell area
was quantified on Alcian blue/PAS-stained slides imaged with a Nanozoomer 2.0-HT
automated slide scanning platform (Hamamatsu) at 200× final magnification.
Slide images were analysed in the Matlab software package (version R2012b by
Mathworks) as 24-bit RGB images. Regions of interest (ROIs) corresponding to
individual profiles of medium and large airways were defined using RGB
thresholding and simple morphological and shape-based filtering. Airway ROIs were
subject to manual curation to remove false positives corresponding to vessels and
other non-airway regions. Epithelial area within each airway ROI was defined
using a similar approach. RGB thresholding was used to identify Alcian blue/
PAS-positive epithelial area and the data were normalized to either cumulative
airway ROI epithelial area or cumulative airway ROI perimeter. For
immunofluorescence quantifications (Fig. 2c, f, h), standard morphological operations
were used to identify airways with characteristically dense DAPI staining surrounded
by empty areas that also contained CC10 or FOXJ1 staining. The total number
of cells in the airway was counted and cells were scored as positive for CC10 or
FOXJ1, KI67 or BrdU, respectively. For the quantification of Fig. 3d, tdTomato cells
on the periphery of each airway were identified using an algorithm based on radial
symmetry. Each tdTomato-positive cell was then scored as positive or negative
for BrdU staining. Quantifications of double-positive cells (Fig. 3c and Extended
Data Fig. 6e) were done by manual counting on whole slide images. The raw data
as well as the averages and standard deviations for these quantifications can be
found in Supplementary Table 1.
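
The normalization of stain-positive epithelial area to total epithelial area within an airway ROI can be illustrated with a toy example. The original analysis was carried out in Matlab; the Python sketch below is not that pipeline. The colour rule, the threshold values and the tiny synthetic image are all hypothetical, and a real Alcian blue/PAS classifier would need thresholds tuned to the staining and scanner.

```python
def is_stain_positive(pixel, min_blue=150, max_red=120):
    """Crude colour rule for blue-dominant pixels; thresholds are hypothetical."""
    r, g, b = pixel
    return b >= min_blue and r <= max_red

def positive_fraction(image, epithelium_mask):
    """Stain-positive epithelial pixels divided by all epithelial pixels in one airway ROI."""
    positive = 0
    epithelial = 0
    for row, mask_row in zip(image, epithelium_mask):
        for pixel, in_epithelium in zip(row, mask_row):
            if in_epithelium:
                epithelial += 1
                if is_stain_positive(pixel):
                    positive += 1
    if epithelial == 0:
        raise ValueError("ROI contains no epithelial pixels")
    return positive / epithelial

# Tiny synthetic 2 x 3 image of 24-bit RGB tuples (not real slide data)
image = [
    [(60, 80, 200), (230, 200, 210), (50, 70, 180)],
    [(240, 240, 240), (70, 90, 190), (200, 150, 160)],
]
# Hypothetical epithelium mask: True where the pixel belongs to airway epithelium
mask = [
    [True, True, True],
    [False, True, True],
]
fraction = positive_fraction(image, mask)
```

Normalizing to ROI perimeter instead, as the text also mentions, would replace the denominator with a perimeter measurement of the airway profile.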

Cytokine concentration measurement. Cytokine concentrations were measured
using the Bio-Rad Bio-Plex 200 system and the Bio-Plex Manager Software v6.0.
The Bio-Plex Pro Mouse Cytokine 9-Plex Assay (MD000000EL) and 23-plex Assay 
(M60009RDPD) were used. Concentrations of cytokines previously reported as 
relevant to goblet cell metaplasia are shown. 

Statistical analysis. For mouse studies, the phenotypes described were clear, 
marked (near-complete conversion) and reproducible. We used 5-7 mice per group 
when generating data for statistical analysis and quantification. We randomized 
animals between the groups based on sex and weight, to normalize the distribution 
of these two parameters. We conducted an unpaired t-test to ascertain statistical 
significance, using the GraphPad Prism software. No statistical method was used 
to predetermine sample size. *P < 0.05, **P < 0.01, ***P < 0.001. No sample or
animal was excluded from analyses. The investigators were not blinded for any 
of the studies. 
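
The comparison described above was run in GraphPad Prism. As a rough illustration of what an unpaired t-test computes, the following Python sketch calculates the two-sample t statistic and maps a P value to the asterisk notation. It assumes the equal-variance (Student's) form, which the text does not specify; the function names and example numbers are hypothetical. Obtaining the P value itself requires the t distribution, available for example through `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(group_a, group_b):
    """Student's two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, dof

def significance_stars(p_value):
    """Map a P value to the asterisk notation used in the figure legends."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""

# Hypothetical measurements for two groups of five mice (not data from the study)
t_stat, dof = unpaired_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```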
b Anti-JAG1.b70 Anti-JAG2.b33
kon (M⁻¹ s⁻¹) koff (s⁻¹) KD (M) χ² (RU²) kon (M⁻¹ s⁻¹) koff (s⁻¹) KD (M) χ² (RU²)
hJAG1 (2.29+0.02)x10° (1.8+0.1)x10* (7.9+0.4)x107° 321 (1.5+0.6)x10° 0.1+0.04 (741)x107 0.26+0.02 
mJAG1*} = (1.3+0.6)x10° (7.8+0.4)x10* (6.00+0.06)x10° 2.0+0.5 steady state measurement (7+3)x107 (1.40+0.2)x10° 
hJAG2 no binding (1.9140.06)x10° (4.02+0.07)x10* (2.10+0.06)x10° 2.7+0.8 
mJAG2 no binding (4.8+0.1)x10° (9.8+0.4)x10° (2.0+0.1)x10"° 942 
hDLL1 no binding no binding 
mDLL1* no binding no binding 
hDLL4 no binding no binding 
mDLL4 no binding no binding 


Extended Data Figure 1 | Surface plasmon resonance affinity
measurements of anti-JAG1.b70 and anti-JAG2.b33 binding. a, Surface
plasmon resonance (SPR) was used to determine anti-JAG1.b70 and
anti-JAG2.b33 binding affinities to purified human (h) or mouse (m)
JAG1 and JAG2 antigens. Representative curves from one assay run with
three technical replicates are shown. At least two additional assays have
been performed with binding to human JAG1 and JAG2 antigens, yielding
consistent results. b, SPR binding constants. For human JAG1, human and
mouse JAG2, human DLL1, and human and mouse DLL4, antibodies were
coated onto a CM5 biosensor chip and the ligand was subsequently added
for binding assessment. *By contrast, because mouse JAG1 and DLL1
showed some background binding to the empty flow cell, the antigens
were coated directly onto the CM5 biosensor chip, and purified antibodies
in Fab fragment format were subsequently added for binding assessment;
steady-state measurements were used for the low-affinity binding of
anti-JAG2.b33 to mouse JAG1. Data are mean ± s.d. of three technical
replicates. See Methods for details.
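
For the kinetic fits, the three binding constants are linked: for simple 1:1 binding, the equilibrium dissociation constant is the ratio of the dissociation and association rate constants, K_D = k_off / k_on. The Python sketch below shows that relation only. The function name is hypothetical and the rate constants are illustrative, not values from the table. It does not apply to the steady-state measurement, where K_D is obtained from equilibrium responses rather than rates.

```python
def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D (M) from k_on (M^-1 s^-1) and k_off (s^-1)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on

# Illustrative rate constants only; these are not values from the table
kd = dissociation_constant(k_on=2.0e5, k_off=1.0e-4)
```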
Extended Data Figure 2 | Notch ligand proteins used to characterize
blocking antibodies, crystallographic data, and in vitro verification of
lack of significant cross-reactivity of anti-JAG2.b33. a, Protein reagents
used to characterize anti-JAG1 and anti-JAG2 binding. Purified human
and mouse JAG1, JAG2, DLL1 and DLL4 extracellular fragments (2-5 µg)
used for antibody characterization studies were analysed by SDS-PAGE
under non-reducing (NR) and reducing (R) conditions. b, Table of
crystallographic data collection and refinement statistics. c, Immunoblot
analysis of NOTCH2 intracellular domain (NICD2) in U87 glioblastoma
cells to assess selectivity of JAG1 inhibition. JAG1 immobilized on beads
was used to induce NOTCH2 signalling in U87 cells, which endogenously


b
                                   huJAG1 - anti-JAG1.b70
Data collection
  Space group                      P2₁
  Cell dimensions
    a, b, c (Å)                    53.26, 80.98, 155.62
    α, β, γ (°)                    90, 94.39, 90
  Resolution (Å)                   80.98 – 2.56 (2.86 – 2.56)*
  CC1/2 > 0.5 (Å)                  2.56
  Rmerge                           0.13 (0.57)
  I/σI                             6.4 (1.8)
  Completeness (%)                 96.6 (97.9)
  Redundancy                       3.3 (3.3)
  Number of unique observations    41,335 (11,823)
Refinement
  Resolution (Å)                   71.79 – 2.56
  No. reflections                  41,207
  Rwork / Rfree                    22.0 / 25.9
  No. atoms†
    Protein                        8746
    Water/glycerol                 170
  B-factors
    Protein                        41.58
    Water/glycerol                 33.62
  R.m.s. deviations
    Bond lengths (Å)               0.007
    Bond angles (°)                1.04

*Highest resolution shell is shown in parentheses.
†Sugars on huJAG1 were not modelled.

Ramachandran statistics as defined by MolProbity:
favoured 94.6% (1076), allowed 4.6% (53), outlier 0.8% (9)
express high levels of NOTCH2, in the presence of the indicated reagents.
NOTCH2 signalling was assessed using an antibody that preferentially
recognizes the γ-secretase-cleaved (active) form of NICD2 (ref. 33).
As a control, a γ-secretase inhibitor (DAPT, 5 µM in DMSO) inhibited
signalling relative to control treatment (DMSO alone), evidenced by a
clear decrease in NICD2 levels. Likewise, anti-JAG1.b70 (25 µg ml⁻¹)
completely blocked NOTCH2 signalling; by contrast, a high concentration
of anti-JAG2.b33 (25 µg ml⁻¹) did not detectably decrease NICD2 levels,
consistent with our other data that this antibody does not inhibit JAG1
signalling despite low-affinity binding to JAG1. Asterisk indicates a nonspecific
band. A full scan of the blot may be found in Supplementary Fig. 1.
a
Human JAG1   TCDDYYYGFGCNKFCRPRDDFFGHYACDQNGNKTCMEGWMGPECNRAICRQGCSPKHGSC 245
Murine JAG1  TCDDHYYGFGCNKFCRPRDDFFGHYACDQNGNKTCMEGWMGPDCNKAICRQGCSPKHGSC 245

Human JAG1   KLPGDCRCQYGWQGLYCDKCIPHPGCVHGICNEPWQCLCETNWGGQLCDKDLNYCGTHQP 305
Murine JAG1  KLPGDCRCQYGWQGLYCDKCIPHPGCVHGTCNEPWQCLCETNWGGQLCDKDLNYCGTHQP 305

Human JAG1   CLNGGTCSNTGPDKYQCSCPEGYSGPNCEI 335
Murine JAG1  CLNRGTCSNTGPDKYQCSCPEGYSGPNCEI 335

b
Human JAG1   TCDDYYYGFGCNKFCRPRDDFFGHYACDQNGNKTCMEGWMGPECNRAICRQGCSPKHGSC 245
Human JAG2   RCDENYYSATCNKFCRPRNDFFGHYTCDQYGNKACMDGWMGKECKEAVCKQGCNLLHGGC 256

Human JAG1   KLPGDCRCQYGWQGLYCDKCIPHPGCVHGICNEPWQCLCETNWGGQLCDKDLNYCGTHQP 305
Human JAG2   TVPGECRCSYGWQGRFCDECVPYPGCVHGSCVEPWQCNCETNWGGLLCDKDLNYCGSHHP 316

Human JAG1   CLNGGTCSNTGPDKYQCSCPEGYSGPNCEI 335
Human JAG2   CTNGGTCINAEPDQYRCTCPDGYSGRNCEK 346

(Fab not shown)


Extended Data Figure 3 | Anti-JAG1.b70 human-mouse cross-reactivity
and inhibitory mechanism. a, b, Amino acid sequence alignment of
human and mouse JAG1 (a) and human JAG1 and JAG2 (b) showing the
DSL through EGF3 domains. Asterisk (*) indicates identical amino acids;
colon (:) denotes conservative difference; full stop (.) denotes
semi-conservative difference; and blank space ( ) denotes non-conservative
difference. c, The anti-JAG1.b70 epitope on human JAG1 is highlighted
pink, with residues unique to mouse JAG1 highlighted in green. Similarly,
residues differing between JAG1 and JAG2 positions are highlighted
yellow. d, The NOTCH1-DLL4 crystal structure (PDB 4XL1; ref. 16)
was superimposed onto our JAG1-anti-JAG1-Fab coordinates. The high
structural similarity between DLL4 and JAG1 is evident within the DSL
and EGF1 domains (root mean squared deviation (r.m.s.d.) < 1.1 Å).
In this view, the DLL4 C2 domain and the anti-JAG1 Fab were omitted
for clarity. e, A ~90° view relative to d, with the anti-JAG1 Fab shown in
space-filling representation. Depicting the anti-JAG1.b70 Fab bound to
JAG1 indicates that the antibody light chain (LC; cyan) would clash with
Notch1 (arrow), thus supporting an inhibitory mechanism based on steric
occlusion of receptor binding.
Extended Data Figure 4 | JAG blocking antibodies inhibit JAG-induced
Notch signalling in vivo. a, Immunofluorescence staining of bronchiolar
epithelium for CC10 (green) and FOXJ1 (red) from mice dosed every
3 days over 8 days with the indicated antibodies. Nuclei were counterstained
with DAPI. b, Haematoxylin and eosin staining of skin from mice treated
with control antibody, anti-JAG1.b70, anti-JAG2.b33 or anti-JAG1.b70
plus anti-JAG2.b33. JAG2 but not JAG1 inhibition induced a loss of mature
sebocytes in sebaceous glands (arrows in control and anti-JAG1.b70 skin,
left two panels) (n = 3 for each group). c, Immunofluorescence staining of
bronchiolar epithelium for CC10 (green) and acetylated-α-tubulin (red)
from mice treated with intranasal administration of control or anti-JAG1.b70
plus anti-JAG2.b33 antibodies for 5 days. Nuclei were labelled with
DAPI (n = 3 for each group). d, Immunofluorescence staining for
acetylated-α-tubulin (red), the neuroepithelial-cell-specific marker
CGRP (pink) and CC10 (green) from mice treated with anti-JAG1.b70
plus anti-JAG2.b33. Right panel shows a merged image and includes
DAPI labelling. Club cells adjacent to neuroepithelial cells escape the
JAG-blockade-induced transdifferentiation. e, Expression analysis of whole
lungs from mice treated with the indicated antibodies (n = 3 for each group).
Genes analysed included the Notch signalling targets Hes1, Hey1 and Hey2
as well as the club-cell-specific genes Scgb1a1, Muc5b and Muc5ac and the
ciliated-cell-specific gene Foxj1. Both fold change and P values are plotted.
Vertical and red horizontal dashed lines delineate twofold changes and
P < 0.05, respectively. Anti-JAG1.b70 significantly increased expression of
Foxj1 and reduced expression of Scgb1a1 by at least twofold. By contrast, no
significant changes were observed when treating with anti-JAG2.b33 alone.
The combination of anti-JAG1.b70 plus anti-JAG2.b33 significantly reduced
expression of all Notch target genes analysed as well as club-cell-specific
genes while increasing expression of Foxj1. Thus, the magnitude and type of
gene expression changes revealed inhibition of Notch signalling and cell fate
conversion in a manner that mirrored antibody-induced cell fate changes
assessed by immunofluorescence and other methods. Scale bars, 10 µm
(a, c), 100 µm (b),
Extended Data Figure 5 | Determination of the functionally relevant
ligand-receptor pair in the adult airway. a, Immunofluorescence
staining for CC10 (green) and acetylated-α-tubulin (red) of mice treated
with isotype control antibody, the NOTCH1 or NOTCH2 blocking
antibodies anti-NRR1 or anti-NRR2, or the combination of anti-NRR1
plus anti-NRR2 (n = 3 for each group). NOTCH2 inhibition alone
markedly reduced club cell numbers, with a corresponding increase in
ciliated cell numbers, whereas NOTCH1 inhibition alone had little or no
effect; dual blockade of both receptors induced the strongest phenotype.
Results indicate that NOTCH2 is the dominant receptor controlling this
phenotype, with NOTCH1 perhaps playing a secondary and functionally
redundant role. b, c, Immunohistochemical staining of airways for JAG1
and NOTCH2 after control treatment (b) or JAG blockade (c) as in
a revealed that JAG1 protein was most prominently expressed in ciliated
cells, although a weak signal was observed throughout the epithelium.
The JAG1 signal was strongly detected throughout the ciliated epithelium
after JAG blockade, suggesting that the ciliated cells are the relevant
signal-sending cell type. NOTCH2 protein staining showed a complementary
pattern, with expression clearly localized to club cells and little or no signal
detectable in the ciliated epithelium after JAG inhibition. d, Consistent




with its secondary role relative to NOTCH2, NOTCH1 protein expression
was very weak in the epithelial layer (top), with strong NOTCH1 staining
on blood vessels from the same tissue sections serving as a positive
control (bottom). e, Fluorescent RNAscope in situ hybridization to detect
mRNA expression of Jag1, Jag2, Notch1 and Notch2 in mouse bronchiolar
epithelium. Specific detection of each of the mouse NOTCH receptor
and ligand mRNAs is shown in green, whereas mouse Foxj1 and Scgb1a1
mRNA, to mark ciliated and club cells, are shown in red and white,
respectively. The signals using this method appear as coloured puncta.
Samples were counterstained with DAPI to reveal nuclei. Consistent with
their functioning as the primary ligand-receptor pair controlling cell fate,
Jag1 and Notch2 were more highly expressed than Notch1 and Jag2, which
was only weakly detectable (n = 4 for each probe). Jag1 and Jag2 signals
appear in the same cells as Foxj1, indicating co-expression in ciliated cells
(arrows, two left panels). By contrast, Notch1 and Notch2 signals appear in
the same cells as Scgb1a1, indicating co-expression in club cells (arrows,
two right panels). These results thus confirm the immunohistochemistry
findings and extend the expression results to include Jag2, for which a
reliable immunohistochemistry method is lacking. Scale bars, 20 µm (a),
10 µm (b-e).
Extended Data Figure 6 | Lack of proliferation and intermediate
cells are seen during transdifferentiation. a, Representative
immunofluorescence images of sections used for quantification shown
in Fig. 2f, of the percentage of airway cells stained for the proliferation
marker KI67 (white) (n = 5 mice per group). b, Representative
immunofluorescence images (BrdU, green) of sections used for the
quantification shown in Fig. 2h, of percentages of BrdU-positive airway
cells (n = 5 mice per group). c, d, Immunofluorescence staining for
CC10 (green), and acetylated-α-tubulin (red, c) or FOXJ1 (red,
d) of bronchiolar epithelium from mice 5 days after treatment with
anti-JAG1.b70 plus anti-JAG2.b33. e, Quantification of the percentage of
intermediate cells over the number of CC10-positive cells, that appeared
positive for both CC10 and FOXJ1 by immunofluorescence at day 5
of control treatment or JAG blockade (n = 4, mean ± s.d.; unpaired
t-test, *P < 0.05). A significant percentage of club cells remaining after
JAG blockade (17.0 ± 6.8%) expressed both CC10 and FOXJ1, a master
transcription factor dictating ciliated differentiation. Such CC10⁺/
FOXJ1⁺ cells were also detected in control lungs, but at significantly
lower percentages (0.46 ± 0.09%). f, Immunogold transmission electron
microscopy for high-resolution detection of CC10 (18-nm gold particles,
arrowheads) and acetylated-α-tubulin (12-nm gold particles, arrows)
in cells from the bronchiolar epithelium. Images of control cells (left)
consistently showed CC10 expression restricted to secretory vesicles
of apparent club cells, adjacent to and distinct from ciliated cells that
expressed acetylated-α-tubulin in the basal bodies and cilia, confirming
the specificity of immunostaining. Treatment with anti-JAG1.b70
plus anti-JAG2.b33 yielded a fraction of cells that displayed both
acetylated-α-tubulin at basal bodies and CC10 in the cytoplasm (right),
consistent with a phenotypic intermediate expected during a club-to-ciliated
cell conversion. Scale bars, 20 µm (a, b), 10 µm (c, d) and 1 µm (f).
Extended Data Figure 7 | Lineage tracing shows that club cells
transdifferentiate into ciliated cells. a, Targeting construct to generate
Scgb1a1-CreERT2°™" knock-in mice. PGK-neo and HSV-tk cassettes
were used for positive and negative selection, respectively. See Methods
for details. b, Whole-mount imaging of the right caudal lobe of
Scgb1a1-CreERT2°%"/Rosa26-lsl-tdTomato mice after mice were treated as in
Fig. 3a. Images shown are maximum z-projections of optical sections
obtained with an ultramicroscope (n = 3 for each group). Scale bar, 1 mm.
The overall signal from the club cell lineage trace appears approximately
equal and well distributed throughout the lobe, even after conversion
to ciliated cells (bottom), confirming the lack of any notable cell loss,
including apoptosis, in transdifferentiating club cells. c, Lineage tracing
as in Fig. 3b, except using FOXJ1 instead of acetylated-α-tubulin as
the ciliated cell marker. Scale bar, 10 µm. d, Lineage tracing results
demonstrating that inhibition of NOTCH1 plus NOTCH2 induces
club-to-ciliated cell transdifferentiation. Scale bar, 10 µm. After treating mice
with anti-NRR1 and anti-NRR2 blocking antibodies, immunofluorescence
staining of the bronchiolar epithelium was performed as in c and Fig. 3b.
After NOTCH1 plus NOTCH2 inhibition, the tdTomato-positive cells,
marking the club cell lineage, express acetylated-α-tubulin (left) and
FOXJ1 (right), and have thus assumed a ciliated cell identity.
Extended Data Figure 8 | Club cells slowly reappear after antibody
washout with expansion localized at the bronchoalveolar duct junctions.
a, Re-establishment of normal club and ciliated cell patterning after
transdifferentiation. Immunofluorescence staining for CC10 (green) and
acetylated-α-tubulin (red) of mice treated with a single dose of isotype
control antibody or anti-JAG1.b70 plus anti-JAG2.b33. Bronchiolar
epithelia were analysed 1, 3, 6, 10 and 13 weeks after treatment, as
indicated, to determine the time needed for club cells to reappear (n = 3
for each time point). The first signs of recovery were evident after 6
weeks, with increased but incomplete recovery observed after 13 weeks.
Reappearing rows of club cells seemed to originate from bronchoalveolar
duct junctions (BAJs), resulting in a gradient of re-establishment from
smaller to larger airways. b, Whole-mount imaging of the right caudal
lobe of Scgb1a1-CreERT2°%"/Rosa26-lsl-tdTomato mice. After induction
of lineage tracing with four doses of tamoxifen (200 mg kg⁻¹), mice were
treated with a single dose of isotype control antibody (left) or anti-JAG1.b70
plus anti-JAG2.b33 (right), 1 week after the last tamoxifen injection. Lungs
were analysed 13 weeks after treatment. Images shown are maximum
z-projections of optical sections obtained with an ultramicroscope
(n = 4 for each group). In lungs from mice treated with anti-JAG1.b70
plus anti-JAG2.b33, clusters of lineage-traced cells were observed at the
BAJs (arrows in bottom right panel); such clusters are absent in control
lungs. c, d, Immunofluorescence staining of bronchiolar epithelium for
the lineage-tracing marker tdTomato (pink), CC10 (green) and FOXJ1
(red, c) or acetylated-α-tubulin (red, d), from the same mice as in b. Whereas
mostly single cells and small clones of lineage-traced cells are found at the
BAJs of control lungs, large clones are seen in lungs after JAG blockade,
confirming the pattern of re-establishment of club cells after treatment (a).
Nuclei were labelled with DAPI. Scale bars, 20 µm (a), 1 mm (b, top),
0.5 mm (b, bottom) and 10 µm (c, d).
Extended Data Figure 9 | NOTCH blockade inhibits goblet cell
metaplasia in vivo. a, Mice were sensitized during a 35-day period after
an i.p. injection of ovalbumin or vehicle (non-sensitized control) and then
challenged with aerosolized ovalbumin for 7 consecutive days to induce
inflammation and goblet cell metaplasia. Mice were also treated, at 1 and
4 days after challenge, with isotype control antibody, anti-JAG1.b70,
anti-JAG2.b33 or anti-JAG1.b70 plus anti-JAG2.b33, as indicated (n = 6
mice per group, mean ± s.d.). b, Alcian blue/PAS staining of lung sections
from mice treated with anti-JAG1.b70, anti-JAG2.b33 or the combination,
on days 36 and 39, indicated as dose 1 and 2 in a. c, Quantification
of goblet cell area (n = 6 for each group, mean ± s.d.; unpaired t-test,
***P < 0.001, **P < 0.01). d, Inflammation index as assessed by
haematoxylin and eosin staining of lung sections. e, Alcian blue/PAS
staining of lung sections from mice treated with anti-NRR1, anti-NRR2,
or the combination, on days 36 and 39, indicated as dose 1 and 2
in a. f, Quantification of goblet cell area (n = 6 for each group, mean ± s.d.;
unpaired t-test, ***P < 0.001). g, Inflammation index as assessed by
haematoxylin and eosin staining of lung sections. All scale bars, 20 µm.




[Figure panels: c, serum cytokines, day 42; d, serum cytokines, day 48. Groups: control, anti-JAG1.b70, anti-JAG1.b70 + anti-JAG2.b33.]
Extended Data Figure 10 | Characterization of the immune response
during the ovalbumin challenge. a, Total numbers of immune cell
populations found in the bronchoalveolar lavage fluid (BALF) of mice
from the prevention goblet cell metaplasia study (Extended Data
Fig. 9a-d). A significant reduction in both lymphocytes and macrophages
is observed (n = 7 for each group, mean ± s.d.; unpaired t-test, *P < 0.05,
**P < 0.01), but not in neutrophils and eosinophils, which are the most
relevant cells that drive the metaplasia phenotype. b, Total numbers of
immune cell populations found in the BALF of mice from the intervention
goblet cell metaplasia study in Fig. 4. Although dual JAG blockade as
well as blockade of JAG1 alone reverse goblet cell metaplasia (Fig. 4), a
significant reduction in neutrophils was observed only after dual JAG
blockade; no changes in other cell types, including eosinophils, were
observed (n = 7 for each group, mean ± s.d.; unpaired t-test, **P < 0.01).
c, Analysis of cytokine levels in blood serum (left) and BALF (right) of
mice from the prevention study summarized in Extended Data Fig. 9a-d.
Both antibody treatments resulted in a significant increase in the levels
of IL4 in the BALF, although this increase was modest and not sustained
to the later time point (n = 7 for each group; mean ± s.d.; unpaired t-test,
*P < 0.05). d, Analysis of cytokine levels in blood serum (left) and BALF
(right) of mice from the intervention study summarized in Fig. 4. Neither
antibody treatment resulted in altered cytokine levels at this time point
(n = 7 for each group; mean ± s.d., unpaired t-test). These results support
an epithelial-cell-specific mechanism in which prevention and reversal of
goblet cell metaplasia reflect direct effects of JAG inhibition on lung
cell fate.
A mechanism for expansion of regulatory T-cell
repertoire and its role in self-tolerance

Stanislav Dikiy, Beatrice E. Hoyos, Bruno Moltedo, Saskia Hemmers, Piper Treuting, Christina S. Leslie,
Dmitriy M. Chudakov & Alexander Y. Rudensky


T-cell receptor (TCR) signalling has a key role in determining T-cell 
fate. Precursor cells expressing TCRs within a certain low-affinity 
range for complexes of self-peptide and major histocompatibility 
complex (MHC) undergo positive selection and differentiate into 
naive T cells expressing a highly diverse self-MHC-restricted TCR 
repertoire. In contrast, precursors displaying TCRs with a high 
affinity for ‘self’ are either eliminated through TCR-agonist-induced 
apoptosis (negative selection)1 or restrained by regulatory T (Treg)
cells, whose differentiation and function are controlled by the 
X-chromosome-encoded transcription factor Foxp3 (reviewed in 
ref. 2). Foxp3 is expressed in a fraction of self-reactive T cells that 
escape negative selection in response to agonist-driven TCR signals 
combined with interleukin 2 (IL-2) receptor signalling. In addition 
to Treg cells, TCR-agonist-driven selection results in the generation 
of several other specialized T-cell lineages such as natural killer 
T cells and innate mucosal-associated invariant T cells*. Although 
the latter exhibit a restricted TCR repertoire, Treg cells display
a highly diverse collection of TCRs*°. Here we explore in mice 
whether a specialized mechanism enables agonist-driven selection 
of Treg cells with a diverse TCR repertoire, and the importance this 
holds for self-tolerance. We show that the intronic Foxp3 enhancer 
conserved noncoding sequence 3 (CNS3) acts as an epigenetic switch 
that confers a poised state to the Foxp3 promoter in precursor cells 
to make Treg cell lineage commitment responsive to a broad range
of TCR stimuli, particularly to suboptimal ones. CNS3-dependent 
expansion of the TCR repertoire enables Treg cells to control self-reactive
T cells effectively, especially when thymic negative selection
is genetically impaired. Our findings highlight the complementary 
roles of these two main mechanisms of self-tolerance. 

TCR signalling plays an essential role in Treg cell differentiation and 
function’~!. Previous studies have shown that a broad range of self- 
reactivity can promote Treg cell differentiation in the thymus, consistent 
with the highly diverse TCR repertoire of these cells*®. We reasoned 
that a dedicated mechanism, linked to the regulation of Foxp3 gene 
expression, might enable selection of Treg cells with a diverse TCR repertoire.
Previously, we showed that an intronic element of the Foxp3
gene, CNS3, increases the efficiency of Treg cell generation, raising the
possibility that it might affect the composition of the Treg TCR repertoire.
To account for the potential effects of a mixed 129/B6 genetic
background in our previous study, we backcrossed the CNS3 knockout
Foxp3ΔCNS3-gfp allele onto a B6 genetic background and generated
male Foxp3ΔCNS3-gfp and Foxp3gfp littermates carrying identical amino-terminal
enhanced green fluorescent protein (GFP) reporters!)!.
Consistent with our previous observation11, we found an ~40% reduction
in Foxp3+CD4+ thymocytes in CNS3-deficient mice, compared
to CNS3-sufficient littermate controls (2.05 ± 0.38% and 3.38 ± 0.70%
(mean ± s.d.) of CD4 single-positive (SP) thymocytes, respectively).
The size of other thymocyte subsets was unaffected (Extended Data
Fig. 1a, b). In contrast, peripheral Treg cells were present at comparable
frequencies, probably owing to homeostatic expansion”!?-!> (Extended
Data Fig. 1a). Interestingly, loss of CNS3 had no effect on Foxp3 expression
in differentiated Treg cells (Extended Data Fig. 1c). Our previous
study suggested that CNS3 is epigenetically marked in precursor cells,
raising the question of at which stage of T-cell differentiation CNS3 acts
to facilitate Treg cell development. We found that ablation of a conditional
CNS3 allele in double positive (DP) or double negative (DN)
thymocytes using Cd4Cre or LckCre drivers, respectively, resulted in similarly
defective thymic Treg cell generation (Extended Data Fig. 1d, e).
To assess the requirement for CNS3 immediately preceding Foxp3
induction, we acutely ablated CNS3 using tamoxifen-inducible Cre
and observed decreased Foxp3 induction upon activation of naive
CD4+ T cells in the presence of TGFβ and IL-2 (Extended Data
Fig. 1f). Notably, in mature Treg cells, CNS3 was fully dispensable for the
maintenance of Foxp3 expression during cell division in the presence of
pro-inflammatory cytokines (Extended Data Fig. 1g, h), and for their
suppressor function in vivo (Extended Data Fig. 2).
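
As a quick arithmetic check on the approximately 40% reduction quoted above, the relative reduction can be recomputed from the two reported means (2.05% and 3.38% of CD4 SP thymocytes). The sketch below is illustrative only: the function name is an assumption made here, and only the two means come from the text.

```python
def relative_reduction(deficient_mean: float, control_mean: float) -> float:
    """Return the fractional reduction of deficient_mean relative to control_mean."""
    if control_mean <= 0:
        raise ValueError("control_mean must be positive")
    return 1.0 - deficient_mean / control_mean


# Means reported in the text (% Foxp3+ cells among CD4 SP thymocytes).
reduction = relative_reduction(2.05, 3.38)
print(f"{reduction:.1%}")  # prints 39.3%
```

The result, about 39%, is consistent with the rounded figure given in the text. It ignores the reported standard deviations, so it says nothing about the spread between individual mice.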

These findings raised the question of how, mechanistically, CNS3 
could selectively facilitate the initiation but not the maintenance of 
Foxp3 expression. To address this problem, we identified the stage 
of thymocyte differentiation at which the CNS3 region first acquires 
the characteristic features of a poised enhancer. We previously found 
that CNS3 is marked by lysine 4 mono-methylation of histone H3 
(H3K4me1) in DP thymocytes11. Unexpectedly, we found increased
H3K4me1 levels at CNS3 at the DN1 stage and in haematopoietic stem
cells, comparable to the levels observed in DP thymocytes, CD4 SP thymocytes
and naive CD4+ and CD8+ T cells (Fig. 1a-c and unpublished
data). In contrast, CNS3 chromatin was not enriched for H3K4me1
in embryonic stem cells, macrophages or dendritic cells (Fig. 1b, c). 
These results indicate that the poised state of CNS3 is established at a 
very early stage of haematopoiesis, but is lost in ‘non-T-cell’ lineages. 
As CNS3 appeared to be the earliest epigenetically modified region in 
the Foxp3 locus, it might exert its function by facilitating chromatin 
remodelling at the Foxp3 promoter. 

While deposition of the ‘active’ histone modifications H3K4me3 
and H3K27ac at the Foxp3 promoter occurred exclusively in Treg cells
(Extended Data Fig. 3a, b), we found an enrichment of H3K4me1 in
mature CD4 SP thymocytes and naive CD4+ T cells (Fig. 1d). In the
absence of CNS3, both mature CD4 SP thymocytes and naive CD4+ T
cells showed impaired H3K4me1 accumulation at the Foxp3 promoter
(Fig. 1e, f), suggesting that CNS3 facilitates epigenetic remodelling
of the Foxp3 promoter in Treg cell precursors. Notably, differentiated
CNS3-deficient Treg cells showed normal levels of H3K4me3 and 

Figure 1 | CNS3 acts as an epigenetic switch for the Foxp3 promoter
poising. a, Chromatin immunoprecipitation and quantitative PCR (ChIP-qPCR)
of H3K4me1 at the Foxp3 locus and control loci (Hspa2, Rpl30 and
Gm5069) in B cells, DP thymocytes, naive CD4+ T (Tn) and Treg cells.
b, c, H3K4me1 at CNS3 in DN and DP thymocytes (b), and
haematopoietic stem cells (HSC), embryonic stem (ES) cells, macrophages
(Mφ) and dendritic cells (DC) (c). d, H3K4me1 at the Foxp3 promoter
in DP, immature CD4 SP (imCD4SP; Foxp3−CD62LloCD69hi), mature
CD4 SP (mCD4SP; Foxp3−CD62LhiCD69lo) thymocytes, and naive
CD4+ T cells. e, f, CNS3-dependent H3K4me1 at the Foxp3 promoter in
mature CD4 SP thymocytes (e) and naive CD4+ T cells (f). g, h, Histone
deacetylase inhibitor butyrate enhances H3K27ac at the Foxp3 promoter
(g) and rescues impaired Treg differentiation of CNS3-deficient T cells
in vitro (h). Two-tailed unpaired t-test. Error bars, mean ± s.e.m.; data
represent triplicate cultures in 1 of >2 experiments.


H3K27ac deposition at the Foxp3 regulatory regions (Extended Data 
Fig. 3c-e), consistent with the dispensable role of CNS3 in differentiated
Treg cells (Extended Data Fig. 1c, g, h).

To address whether the CNS3-dependent poised state of the Foxp3 
promoter assists deposition of additional permissive marks and fur- 
ther chromatin remodelling that facilitates the initiation of Foxp3 
expression, we cultured naive CD4+ T cells from male Foxp3gfp and
Foxp3ΔCNS3-gfp littermates under Treg cell differentiation conditions,
and isolated Foxp3− cells that had been exposed to Foxp3-inducing
conditions but had not yet acquired Foxp3 expression. We observed a 
CNS3-dependent increase in H3K27ac at the Foxp3 promoter preced- 
ing Foxp3 expression (Extended Data Fig. 3f), consistent with the 
defect in Treg cell differentiation in the absence of CNS3 (Extended Data 
Fig. 1f). Furthermore, blocking the recruitment of bromodomain- 
containing histone acetylation readers using the inhibitor iBET sharply 
reduced Treg cell induction efficiency in a dose-dependent manner’®
(Extended Data Fig. 3g). Conversely, blocking histone deacetylase 
activity using butyrate increased H3K27ac at the Foxp3 promoter in 
agreement with recent reports!”"!° (Fig. 1g). Notably, provision of 
butyrate rescued impaired in vitro Treg cell differentiation associated 
with loss of CNS3 (Fig. 1h). These observations suggest that a CNS3- 
dependent poised state at the Foxp3 promoter, probably via looping, in 
precursor cells may enhance their sensitivity to Foxp3-inducing signals 
(Extended Data Fig. 3h). 

While CNS3-dependent poising of the Foxp3 promoter could facil- 
itate Foxp3 induction in a probabilistic manner, it might also enable 
lower strength TCR signals to promote Treg cell differentiation. To
address this possibility, we tested whether impaired Foxp3 induction
in CNS3-deficient naive CD4+ T cells could be rescued by increasing
amounts of CD3 antibody under in vitro Treg cell differentiation conditions.
We found that the relative difference in the efficiency of Foxp3
induction between CNS3-sufficient and -deficient CD4+ T cells was
markedly decreased in the presence of higher amounts of CD3 antibody
(Fig. 2a and Extended Data Fig. 4a), suggesting that increased TCR
signal strength can partially compensate for the lack of Foxp3 promoter
poising in the absence of CNS3, and that differentiation of Treg precursors
receiving lower TCR stimulation might be disproportionally
impeded by CNS3 deficiency. These results suggest that the mature Treg 
cells differentiated from CNS3-deficient precursors are enriched for 
TCRs at the higher end of the self-reactivity spectrum, and depleted of 
those with lower self-reactivity. 

To test this possibility we analysed Foxp3− and Foxp3+ CD4 SP thymocytes
and naive CD4+ T cells for the expression of orphan nuclear
receptor Nur77 (also known as Nr4a1), the product of a prominent
TCR target gene in Treg and conventional T cells that accurately reports
the strength of TCR signalling”®”!. In agreement with the in vitro
Foxp3 induction studies, we found that both thymic and peripheral
CNS3-deficient Treg cells, but not Foxp3− CD4+ T cells, were markedly
enriched for cells expressing higher levels of Nur77 in comparison
to their CNS3-sufficient counterparts. This trend was observed
in non-competitive settings of male Foxp3gfp and Foxp3ΔCNS3-gfp
littermates, as well as in competitive settings, on ablation of CNS3
in bone marrow chimaeras and heterozygous female Foxp3gfp/+ and
Foxp3ΔCNS3-gfp/+ littermates (Fig. 2b, Extended Data Fig. 4b-d and
unpublished data). Notably, Nur77 levels were increased in both resting
(CD44loCD62Lhi) and activated (CD44hiCD62Llo) CNS3-deficient
Treg cells (Extended Data Fig. 4e-g). In contrast, Nur77 expression
in Foxp3− thymocytes and peripheral T cells was unaffected in the
absence of CNS3 (Extended Data Fig. 4d). Consistently, CNS3-deficient
Treg cells expressed increased amounts of CTLA4, a major negative
feedback regulator of TCR signalling, and cell proliferation marker
Ki-67 (Extended Data Fig. 4h, i). Finally, we found that CNS3-deficient
and -sufficient Foxp3+ CD4 SP thymocytes and resting peripheral Treg
cells exhibited distinct gene expression profiles, in contrast to Foxp3−
subsets (Extended Data Fig. 5a). Specifically, the expression of TCR-dependent
genes and genes characteristic of activated Treg cells was
significantly increased in CNS3-deficient Foxp3+ thymocytes and
resting Treg cells in comparison to their CNS3-sufficient counterparts
(Fig. 2c, d and Extended Data Fig. 5b, d). In contrast, transcriptional
profiles of CNS3-sufficient and -deficient activated Treg cells and
Foxp3− CD4 SP thymocytes were similar (Extended Data Fig. 5a, c, e
and unpublished data). These results further support the notion that
loss of CNS3 results in enrichment of thymic Treg cells with heightened
TCR signal strength.
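
The Fig. 2 legend states that the gene-set comparisons in panels c and d used a one-tailed Kolmogorov-Smirnov test. As a rough illustration of what that statistic measures, the sketch below computes the one-sided two-sample statistic D+ = max over x of [F_a(x) - F_b(x)] from empirical cumulative distributions. The function name and the toy fold-change values are hypothetical and are not data from the study; no P value is computed, and the authors' exact procedure is not specified in this section.

```python
from bisect import bisect_right


def ks_one_sided_statistic(sample_a, sample_b):
    """Return D+ = max_x [F_a(x) - F_b(x)] using empirical CDFs of both samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    best = 0.0
    # The difference of two step functions can only change at observed values.
    for x in sorted(set(a) | set(b)):
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        best = max(best, f_a - f_b)
    return best


# Hypothetical log2 fold changes: all genes versus a gene set shifted upwards.
all_genes = [-0.5, -0.2, 0.0, 0.1, 0.3]
gene_set = [0.4, 0.6, 0.8, 1.0]

# A large D+ here means the gene set's distribution lies to the right of the background.
print(ks_one_sided_statistic(all_genes, gene_set))
```

Because the test is one-tailed, the order of the arguments matters: swapping the two samples in this toy example gives a statistic of zero.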

To assess the auto-reactivity of CNS3-deficient versus -sufficient Treg
cells in vivo, we examined their capacity for MHC class II (MHC-II)-dependent
homeostatic expansion under lymphopenic conditions,
known to be proportional to TCR affinity for self>. CNS3-deficient
Treg cells expanded markedly compared to CNS3-sufficient counterparts
after co-transfer with congenically labelled effector T cells into
lymphopenic hosts (Tcrb−/− Tcrd−/−), probably driven by the recognition
of self-antigens presented by MHC-II molecules because antibody-mediated
blockage of MHC-II prevented expansion of Treg cells
and erased the advantage of CNS3-deficient Treg cells over their
CNS3-sufficient counterparts (Fig. 2e, f and Extended Data Fig. 5f).
Accordingly, the frequency of CNS3-deficient Treg cells was noticeably
increased in the periphery of Foxp3ΔCNS3-gfp/+ compared to Foxp3gfp/+
heterozygous female mice (Extended Data Fig. 5g). Thus, Treg cells
that developed from precursors lacking CNS3 had a skewed TCR
repertoire.

We next examined the TCR repertoires of Treg cells and naive and
activated CD4+ T cells in Foxp3ΔCNS3-gfp or Foxp3gfp Tcra−/+ mice
expressing the DO11.10 TCRβ chain transgene, through which
TCR diversity is limited to a single functional TCRα chain locus’.
Barcoded TCRα libraries were generated using an optimized protocol,
and high-throughput sequencing data were analysed using the
MIGEC software package”. Cluster analysis using VDJtools”’ showed


3 DECEMBER 2015 | VOL 528 | NATURE | 133 


© 2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 2 graphic, panels a-h, not reproduced. Recoverable labels: relative change of Foxp3+ (%); thymus, spleen; log2 expression fold change; Treg isolation, transfer and analysis; control Ab versus anti-MHC-II.]


Figure 2 | CNS3 shapes the Treg cell repertoire. a, CNS3 facilitates
in vitro Treg induction at suboptimal TCR signalling strength. Error bars,
mean ± s.e.m. of triplicate cultures representing 1 of >2 experiments.
b, Nur77 protein expression in CNS3-deficient and -sufficient Treg cells
(summarized in Extended Data Fig. 4b). LN, lymph nodes. c, d, Relative
gene expression levels (RNA-seq) were compared to the upregulated (up)
and downregulated (down) genes in activated (aTreg) versus resting (rTreg)
Treg cells from wild-type Foxp3gfp mice (c), or to those downregulated in
aTreg cells subjected to Cre-induced TCR ablation or mock treatment (d).
The numbers of genes in each comparison are indicated in parentheses.
Foxp3gfp (n = 3); Foxp3ΔCNS3-gfp (n = 4). One-tailed Kolmogorov-Smirnov
test. e, f, Analysis of the expansion potentials of CNS3-deficient and
-sufficient Treg cells in lymphopenic hosts. CD45.1+ and CD45.2+ Treg cells


that TCRα repertoires of thymic and peripheral CNS3-deficient
and -sufficient Treg cells, but not naive or activated effector CD4+ T
cells, were distinct (Fig. 2g). Further analysis showed a significantly
reduced TCRα diversity of CNS3-deficient Treg cells, but not naive or
effector CD4+ T cells, in comparison to their CNS3-sufficient counterparts
(Fig. 2h). As the TCR complementarity-determining region
3 (CDR3) largely determines TCR specificity for peptide-MHC
(pMHC) complexes, we assessed the frequencies of strongly interacting
amino acid residues in the TCRα chain CDR3 by leveraging a
mathematical model linking the features of amino acid residues in the
CDR3 to TCR affinity for pMHC™. Interestingly, the TCRα CDR3s
were significantly enriched for strongly interacting amino acid residues
(Extended Data Fig. 5h, i), and for more randomly added nucleotides
(Supplementary Table 1) in CNS3-deficient versus -sufficient
Treg cells, but not naive or effector CD4+ T cells. These results implied
higher affinities of CNS3-deficient Treg TCRs for self-antigens and
further supported the notion that CNS3 shapes the Treg TCR repertoire
by increasing its diversity, probably by enabling Treg differentiation in
response to a broad range of self-reactivity.
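
The Fig. 2 legend notes that repertoire diversity was assessed with the inverse Simpson index at an identical sampling size. The sketch below illustrates that idea in plain Python: each repertoire is downsampled to the same number of reads before computing 1 / sum(p_i^2). It is not the VDJtools implementation; the function names and the toy clonotype counts are assumptions made for illustration only.

```python
import random
from collections import Counter


def inverse_simpson(counts):
    """Inverse Simpson index, 1 / sum(p_i^2), for an iterable of clonotype counts."""
    counts = list(counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 / sum((c / total) ** 2 for c in counts)


def downsample(clonotype_counts, size, seed=0):
    """Draw `size` reads without replacement so samples are compared at equal depth."""
    pool = [clone for clone, n in clonotype_counts.items() for _ in range(n)]
    if size > len(pool):
        raise ValueError("size exceeds the number of available reads")
    return Counter(random.Random(seed).sample(pool, size))


# Hypothetical toy repertoires (clonotype label -> read count), not data from the study.
even = {"A": 25, "B": 25, "C": 25, "D": 25}
skewed = {"A": 85, "B": 5, "C": 5, "D": 5}

print(inverse_simpson(even.values()))  # 4.0: four equally abundant clonotypes
print(inverse_simpson(skewed.values()))  # lower, because one clonotype dominates
print(inverse_simpson(downsample(skewed, 50).values()))  # same index at a fixed depth of 50 reads
```

Sampling every repertoire to the same depth matters because samples sequenced to different depths are not directly comparable; a fixed seed is used here only to make the toy example reproducible.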

To understand the functional significance of CNS3-dependent regulation
of the Treg cell repertoire, we first assessed the immune status
of male Foxp3ΔCNS3-gfp and their wild-type Foxp3gfp littermates.
CNS3 deficiency had no observed effect on the numbers of activated
or memory CD4+ or CD8+ T cells, or on cytokine production
by T cells in the secondary lymphoid organs of 8-12-week-old mice
(Fig. 3a-c and unpublished data). Although CNS3-deficient Treg cells
were capable of preventing systemic autoimmunity (Fig. 3a-c and
Extended Data Fig. 1a), it remained possible that the skewed Treg
TCR repertoire might have ‘holes’. Thus, we reasoned that a select
few non-lymphoid organs may exhibit focused immune activation in
CNS3-deficient versus CNS3-sufficient mice, whereas others might be
similarly or even more protected against autoimmunity by the over-represented
highly autoreactive Treg cells. Indeed, we found increased
numbers of activated effector T cells and elevated IL-13 and IFNγ
sorted from mixed bone-marrow chimaeras (MBC) of CD45.1+ Foxp3gfp
and CD45.2+ Foxp3ΔCNS3-gfp were mixed at a 1:1 ratio and co-transferred
with wild-type naive Foxp3−CD4+ T cells into Tcrb−/− Tcrd−/− recipients
treated with MHC-II-blocking antibody (Ab) or control IgG before
and after the transfer (n = 5 per group). Ratios of recovered Treg cells to
their inputs are shown (f). Two-tailed unpaired Mann-Whitney test,
representative of 1 of 3 experiments. g, h, Cluster (g) and diversity (h)
analysis of TCRα repertoires in Foxp3gfp (n = 5) and Foxp3ΔCNS3-gfp (n = 3)
Tcra−/+ mice bearing the DO11.10 TCRβ transgene. An identical sampling
size was used to assess the diversity with the inverse Simpson index.
Coloured bars represent individual mice. Teff, effector CD4+ T cells. Error
bars, mean ± s.e.m.; two-tailed unpaired t-test.


production by T cells in the lungs of Foxp3ΔCNS3-gfp mice (Fig. 3a-c).
We also observed markedly increased titres of circulating autoantibodies
against several self-antigens in the sera of CNS3-deficient
mice versus their wild-type Foxp3gfp littermates, whereas the absolute
amounts of immunoglobulin (Ig) isotypes were comparable
(Fig. 3d, Extended Data Fig. 6a and unpublished data). This notion was
further supported by the observed modest, but consistent, decrease
in the severity of experimental autoimmune encephalomyelitis in
Foxp3ΔCNS3-gfp versus Foxp3gfp littermates (Extended Data Fig. 6b-f).
To compare the suppressive capacity of Treg cells developed in CNS3-deficient
and -sufficient mice, we transferred these Ly5.2+ Treg cells
together with Ly5.1+ Foxp3-null (ΔFoxp3) effector T cells into T-cell-deficient
recipients (Fig. 3e). Despite comparable expansion of CNS3-sufficient
and -deficient Treg cells in the recipients, we observed more
pronounced weight loss and increased pro-inflammatory cytokine production
by effector ΔFoxp3 T cells in the presence of CNS3-deficient
Treg cells in comparison to the control (Fig. 3f-h and Extended Data
Fig. 6g, h). These results indicate that Treg cells developed from CNS3-deficient
precursors were selectively impaired in their capacity to suppress
self-reactive effector T cells.

Both negative selection and Treg cell generation are driven by self-antigen
recognition in the thymus and probably have complementary
roles in self-tolerance’”’>”®. We reasoned that the relatively mild
impairment in suppressive capacity of Treg cells from CNS3-deficient
mice on a B6 genetic background, resistant to autoimmunity, may not
fully reveal the biological significance of CNS3-dependent broadening
of the Treg cell repertoire because of efficient negative selection.
Therefore, we assessed the consequences of combined deficiency in
CNS3 and Aire (autoimmune regulator), a nuclear factor required
for thymic negative selection and optimal Treg cell generation. Loss
of Aire leads to diminished expression of a subset of tissue-restricted
antigens in the thymus and, consequently, an enlarged self-reactive
effector T-cell pool and diminished Treg repertoire””””*. In contrast
to late-onset and mild autoimmunity observed in Aire-knockout (KO)

Figure 3 | Defective self-tolerance in the presence of CNS3-deficient
Treg cells. a-c, Analysis of the activation (CD44hiCD62Llo) (a), IFNγ
(b) and IL-13 (c) production in CD4+Foxp3− and/or CD8+ T cells in
Foxp3ΔCNS3-gfp (n = 11) and Foxp3gfp (n = 9) mice. Two-tailed unpaired
Mann-Whitney test. MLN, mesenteric LN. d, Analysis of circulating IgG
against multiple self-antigens in the serum of Foxp3gfp and Foxp3ΔCNS3-gfp
littermates (n = 4 per group). Plots show minimum, maximum, first and
third quartiles and median. Two-way analysis of variance (ANOVA).
e-h, Compromised suppressor capacity of CNS3-deficient Treg cells


mice on a B6 genetic background, deficiency of both CNS3 and Aire 
resulted in fatal early-onset aggressive autoimmune lesions in multiple 
tissues as early as 3-4 weeks of age, whereas detectable autoimmune 
inflammation was lacking in littermates with a single deficiency in 
Aire or CNS3 (Fig. 4a and Extended Data Fig. 7a). We noticed a 100% 
(n > 35) penetrance with a stochastic gender-independent variation 
in manifestations expected from perturbations in randomly generated 
repertoires of self-reactive T cells as well as the probabilistic nature 
of negative selection*” (Extended Data Fig. 7a and unpublished data). 
This was accompanied by significant increases in CD4+ T-cell activation,
IFNγ production (Fig. 4b, c), serum Ig levels (Extended Data
Fig. 7b) and autoantibody production (Fig. 4d). Combined Aire and
CNS3 deficiency resulted in a further reduction in thymic Treg cell frequency
in comparison to the single-deficient mice (1.66 ± 0.28% and
0.91 ± 0.35% in Foxp3ΔCNS3-gfp AireKO/WT and Foxp3ΔCNS3-gfp AireKO/KO
mice, respectively)”> (Fig. 4e). However, peripheral Treg cells reached
normal levels in young Foxp3ΔCNS3-gfp AireKO/KO mice before development
of clinical signs of disease, probably owing to homeostatic proliferation
(Fig. 4f). Despite their normal quantities and Foxp3 expression,
these Treg cells were unable to suppress pathogenic self-reactive T cells
resulting from impaired negative selection in the absence of Aire
in vivo. CD45.2+ CNS3-sufficient or -deficient Treg cells were co-transferred
at a 1:10 ratio with CD45.1+ Foxp3-null effector CD4+
T cells (ΔFoxp3 Teff) into Tcrb−/− Tcrd−/− mice (e). Recipient mice
were monitored for body weight change (f), IFNγ (g) and IL-2 (h)
production. Mice transferred with ΔFoxp3 Teff alone succumbed to severe
inflammation and were killed on day 45. ΔFoxp3 Teff (n = 5); ΔFoxp3 Teff
plus Foxp3gfp (n = 8); ΔFoxp3 Teff plus Foxp3ΔCNS3-gfp (n = 6). Two-tailed
unpaired t-test (f) or Mann-Whitney test (g, h). Error bars, mean and
s.e.m. (f). Results are representative of 2 independent experiments.


(Fig. 4b-d and Extended Data Fig. 7c). As diminished thymic Treg cell 
numbers and their skewed TCR repertoire probably contributed to disease
severity in Foxp3ΔCNS3-gfp AireKO/KO mice, we directly assessed the
ability of CNS3-sufficient and -deficient Treg cells developed in the presence
of Aire to control Foxp3ΔCNS3-gfp AireKO/KO effector T cells when
adoptively transferred into T-cell-deficient hosts. Although the negative
effect of CNS3 deficiency on the TCR repertoire was probably mitigated
by Treg cell expansion in lymphopenic settings, CNS3-deficient Treg cells
still exhibited compromised ability to suppress the responses of transferred
Foxp3ΔCNS3-gfp AireKO/KO effector T cells and resident B cells in
comparison to the controls (Extended Data Fig. 7d-i). These results
suggest that control of broad self-reactive T cells requires a diverse
CNS3-dependent repertoire of Treg cells.

Our studies suggest that CNS3, an intronic Foxp3 regulatory element, 
establishes a poised state of the Foxp3 promoter in precursor cells and 
increases the probability of Foxp3 induction in response to TCR stim- 
ulation, particularly within a lower range of signal strength (Extended 
Data Fig. 8). Similar mechanisms of promoter poising may operate in 
other cell types and enable them to respond to a wider spectrum of 
growth factor or morphogen concentrations through receptor-triggered 
analogue signalling. CNS3-mediated Foxp3 promoter poising expands 

Figure 4 | CNS3-deficient Treg cells fail to maintain self-tolerance in the
absence of Aire. a, Analysis of tissue inflammation in CNS3 Aire double
knockout (DKO) mice. n = 3 per group except for Foxp3ΔCNS3-gfp AireKO/KO
(n = 4). Two-tailed unpaired Mann-Whitney test. b, Analysis of activated
CD4+ T cells in DKO mice. Foxp3gfp AireKO/KO (n = 10); Foxp3gfp
AireKO/WT (n = 6); Foxp3ΔCNS3-gfp AireKO/KO (n = 11); Foxp3ΔCNS3-gfp
AireKO/WT (n = 11). Two-tailed unpaired Mann-Whitney test. c, Analysis
of IFNγ production by CD4+Foxp3− T cells in DKO mice. Foxp3gfp
AireKO/KO (n = 5); Foxp3gfp AireKO/WT (n = 3); Foxp3ΔCNS3-gfp AireKO/KO
(n = 7); Foxp3ΔCNS3-gfp AireKO/WT (n = 9). Data are representative of
>2 experiments. Two-tailed unpaired Mann-Whitney test. d, Analysis
of tissue-specific autoantibodies in the serum of DKO mice. Sections of
skin, intestine and eye from gender-matched Rag1-deficient mice were
stained with serum IgG (>8 mice per group). DAPI, 4',6-diamidino-2-phenylindole.
e, f, Analysis of Foxp3+ CD4 SP thymocytes (e) and
peripheral Treg cells (f) in Foxp3ΔCNS3-gfp AireKO/KO mice. Foxp3gfp
AireKO/KO (n = 10); Foxp3gfp AireKO/WT (n = 6); Foxp3ΔCNS3-gfp AireKO/KO
(n = 11); Foxp3ΔCNS3-gfp AireKO/WT (n = 11). Two-tailed unpaired
Mann-Whitney test.


the TCR repertoire of Treg cells, which is essential for controlling path- 
ogenic self-reactive T cells that escape negative selection. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

No statistical methods were used to determine sample size.

Mice. Foxp3-N3'& mice were generated using ES cell line CY2.4 (C57BL/6) as
previously described11. Cd4Cre, LckCre, UbcCre-ERT2 and Rosa26-stop-YFP (R26Y)
mice were obtained from the Jackson Laboratories. DO11.10 TCRβ transgenic
and Aire-knockout mice were provided by P. Marrack, and D. Mathis and
C. Benoist, respectively. Heterozygous females carrying Foxp3ΔCNS3-gfp and Foxp3gfp
were crossed with B6 males to generate hemizygous Foxp3ΔCNS3-gfp and wild-type
Foxp3gfp littermates. Foxp3?"®, Foxp3-null, Rag1−/−, CD45.1+ Foxp3gfp and
Tcrb−/− Tcrd−/− mice were maintained in our animal facility. To study the genetic
interactions between CNS3 and Aire, heterozygous females of Foxp3ΔCNS3-gfp/gfp
were first crossed with AireKO/WT, and F1 harbouring AireKO/WT and Foxp3ΔCNS3-gfp
or Foxp3gfp were then intercrossed to generate AireKO/KO or AireKO/WT mice carrying
Foxp3ΔCNS3-gfp or Foxp3gfp. To examine TCR diversity with restricted repertoire,
Foxp3ΔCNS3-gfp/gfp heterozygous females were crossed to the DO11.10 TCRβ transgenic
and Tcra−/+ males. F1 males of Foxp3ΔCNS3-gfp or Foxp3gfp mice carrying the
DO11.10 TCRβ transgene and Tcra−/+ were used for T-cell isolation and TCR
sequencing. To induce deletion of CNS3 in vivo, tamoxifen solution (40 mg ml−1
in olive oil) was administered by gavage to UbcCre-ERT2 Foxp337'8i? R26Y mice
more than 3 days before lymphocyte isolation.

All mice were maintained in the MSKCC animal facility under SPF conditions, 
and the experiments were approved by the Institutional Review Board (IACUC 
08-10-023). The experiments were not randomized and the investigators were not 
blinded to allocation during experiments and outcome assessment. 

Statistical analysis. Statistical tests were performed with Prism (GraphPad), Excel 
(Microsoft) or R statistical environment. Box-and-whisker plots show minimum, 
maximum, first and third quartiles and median. 
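
The box-and-whisker convention described above (minimum, maximum, first and third quartiles and median) is a five-number summary. The sketch below shows one way to compute it with the Python standard library. It is an illustration, not the procedure used by Prism, Excel or R: quartile conventions differ between packages, and the "inclusive" method chosen here is an assumption, as are the function name and the example values.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # n=4 yields the three quartile cut points; "inclusive" treats the data as a full population.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical measurements, not data from the study.
print(five_number_summary([1, 2, 3, 4, 5]))  # (1, 2.0, 3.0, 4.0, 5)
```

With small groups, such as the n = 4 per group shown in Fig. 3d, the choice of quartile method can visibly change the box edges, so any reanalysis should state which convention it uses.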

Cell culture. For in vitro Treg cell differentiation, naive CD4+ T cells
(GFP−CD25−CD44loCD62Lhi) or mature CD4+CD8− SP (TCRβhiGFP−CD25−CD62LhiCD69lo)
T cells were sorted from Foxp3gfp, Foxp3ΔCNS3-gfp or
Foxp3\N3-'8? mice after the enrichment of CD4+ T cells or depletion of CD8+
T cells using Dynabeads FlowComp Mouse CD4 or CD8 kits, respectively
(Life Technologies), and then cultured with lethally irradiated (20 Gy) antigen-presenting
cells (splenocytes depleted of T cells with Dynabeads FlowComp
Mouse CD90.2 kit, Life Technologies) or on plates pre-coated with CD3 and
CD28 antibodies in RPMI1640 supplemented with 10% fetal bovine serum
(FBS), 2 mM L-glutamine, 1 mM sodium pyruvate, 10 mM HEPES, 2 x 10°°M
2-mercaptoethanol, 100 U ml−1 penicillin, 100 mg ml−1 streptomycin, 500 U ml−1
IL-2 and 1 ng ml−1 TGFβ. Sodium butyrate (water solution) or iBET solution in
dimethylsulfoxide (a gift from R. Prinjha) was added to the culture to block histone
deacetylase or bromodomain-containing proteins, respectively. Treg cells were
sorted on the basis of Foxp3gfp reporter expression. Assessment of the stability of
Foxp3 expression in vitro was performed as previously described*’. Briefly, Treg cells
were activated in culture in the presence of CD3 and CD28 antibody-coated beads
(Life Technologies) with the following recombinant pro-inflammatory cytokines:
IL-2 (250 U ml−1), IL-4 (20 ng ml−1), IL-6 (10 ng ml−1), IFNγ (100 ng ml−1) and
IL-12 (20 ng ml−1).

In vivo suppression assay. To assess Treg-cell suppressor capacity in vivo we
conducted adoptive T-cell transfers into T-cell-deficient recipients as previously
described*!. Briefly, ~2.5 x 10°-3.0 x 10° Foxp3− CD4+ and/or CD8+ T cells isolated
from Foxp3-null or Foxp3ΔCNS3-gfp AireKO/KO mice were transferred to congenic
and gender-matched Tcrb−/− Tcrd−/− recipients alone or at a 10:1 ratio with
sorted Treg cells from Foxp3gfp or Foxp3ΔCNS3-gfp littermates. Similar numbers of
effector T cells and Treg cells were used for in vivo evaluation of Treg suppressor
function after acute ablation of CNS3. Recipient mice were monitored for body
weight change regularly and lymphocytes were analysed by flow cytometry at least
4 weeks after the transfer.

Treg cell homeostatic proliferation in lymphopenic mice. Around 8-10 weeks
after bone marrow reconstitution of CD45.1+ Foxp3gfp and CD45.2+ Foxp3ΔCNS3-gfp
in Tcrb−/− Tcrd−/− recipients, CD45.1+ and CD45.2+ Treg cells (CD4+GFP+) were
sorted, mixed at a 1:1 ratio and co-transferred into Tcrb−/− Tcrd−/− male mice
with a tenfold excess of naive CD4+ T cells (CD25−CD44loCD62Lhi) isolated from wild-type
CD45.2+ B6 males. To block TCR stimulation by pMHC-II complexes, 0.5 mg of
I-Ab-specific monoclonal antibody Y3P (IgG2a) or control IgG2a (Bio X Cell)
was injected intravenously every other day before and after T-cell transfer. The
lymphocyte subsets were analysed by flow cytometry 9 days later.

Flow cytometric analyses and tissue lymphocyte preparation. Tissue
lymphocytes were prepared as previously described. The following
fluorophore-conjugated antibodies were used for cell-surface staining: CD4 (RM4-5,
eBioscience), CD8 (5H10, Life Technologies), CD25 (PC61.5, eBioscience), CD3e
(145-2C11, eBioscience), CD44 (IM7, eBioscience), CD62L (MEL-14,
eBioscience), CTLA4 (UC10-4B9, eBioscience), TCRβ (BioLegend), CD45.1 (A20,
eBioscience) and CD45.2 (104, eBioscience). Antibodies used for intracellular
staining were: Foxp3 (FJK-16s, eBioscience), Ki-67 (B56, BD Biosciences), IL-17
(eBio17B7, eBioscience), IFNγ (XMG1.2, eBioscience) and IL-2 (JES6-5H4,
eBioscience). To stain endogenous Nur77, cells were incubated with rabbit anti-Nur77
antibody (Cell Signaling) after fixation and permeabilization with a
Foxp3/transcription-factor-staining buffer set (eBioscience), followed by
phycoerythrin-conjugated donkey anti-rabbit antibody (eBioscience). For the flow cytometric
analysis of cytokine production, lymphocytes were first stimulated in vitro with
10 µg ml⁻¹ of CD3 antibody in the presence of monensin (BD Biosciences) at
37 °C for 5 h, then stained with antibodies against the indicated cell-surface
markers, followed by staining of cytokines with an intracellular staining kit (BD
Biosciences). All flow cytometric analyses were performed using a live-cell gate
defined as negative by staining with the LIVE/DEAD Fixable Dead Cell Stain
Kit (Life Technologies). Flow cytometric analysis was performed with FlowJo
(Treestar).

Retroviral transduction. The Cre coding region was subcloned into MigR1-IRES-
Thy1.1 vector (A. Levine, unpublished data) to generate MigR1-Cre-IRES-Thy1.1.
Retroviral packaging with regular Phoenix-ECO cells and transduction of Treg cells
were performed following standard protocols.

Autoantibody profiling using autoantigen microarrays. Analysis of
autoantibody reactivity against a panel of 95 autoantigens was conducted using the
autoantigen microarrays developed by the University of Texas Southwestern Medical
Center. Briefly, serum samples pretreated with DNase-I and diluted at 1:50
were incubated with the autoantigen arrays. After a second incubation with
Cy3-conjugated anti-mouse IgG, the arrays were scanned with a Genepix 4200A
scanner (Molecular Devices). The fluorescent signals for individual
autoantigens were extracted from the resulting images with Genepix Pro 6.0 (Molecular
Devices), followed sequentially by subtraction of local background, averaging
of duplicates, normalization with total IgG, and subtraction of a negative PBS
control.
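
The order of the signal-processing steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Genepix Pro workflow: the function name and data layout are hypothetical, and the text does not say how signals were normalized with total IgG, so division by the total-IgG signal is assumed here.

```python
from statistics import mean


def process_autoantigen_signal(spots, total_igg, pbs_control):
    """Process one autoantigen for one serum sample (illustrative sketch).

    spots: list of (foreground, local_background) pairs, one per duplicate spot.
    total_igg: total-IgG signal of the same sample (assumed to be > 0).
    pbs_control: normalized signal of the same antigen on the PBS-only array.
    """
    # Step 1: subtract the local background from each duplicate spot.
    corrected = [foreground - background for foreground, background in spots]
    # Step 2: average the duplicates.
    averaged = mean(corrected)
    # Step 3: normalize with total IgG (division is an assumption of this sketch).
    normalized = averaged / total_igg
    # Step 4: subtract the negative PBS control.
    return normalized - pbs_control
```

For example, `process_autoantigen_signal([(1200, 200), (1000, 200)], total_igg=1000, pbs_control=0.1)` averages the background-corrected duplicates (1000 and 800) before normalizing and subtracting the control.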

TCR sequencing and data analysis. Cell isolation and RNA extraction. Lymphocytes
were collected from the peripheral lymphoid organs or thymi of 6-8-week-old male
Foxp3gfp or Foxp3ΔCNS3-gfp Tcra+/− littermates bearing the DO11.10 TCRβ transgene,
and were enriched for CD4+ T cells (Dynabeads FlowComp Mouse CD4 kit, Life
Technologies) or depleted of CD8+ T cells (Dynabeads FlowComp Mouse CD8 kit,
Life Technologies), respectively. Treg cells (CD4+GFP+), mature Foxp3− CD4
SP thymocytes (CD4+CD8−GFP−CD25−CD62LhiCD69lo), peripheral naive
(CD4+GFP−CD25−CD44loCD62Lhi) and effector (CD4+GFP−CD44hiCD62Llo)
CD4+ T cells were then isolated using a FACSAria II sorter (BD) gated on TCR-Vβ8+.
Extraction of total RNA from TRIzol-preserved cell lysates was performed
according to the manufacturer's instructions (Life Technologies). mRNA was purified
from total RNA with Dynabeads mRNA DIRECT Kit (Life Technologies) and
used for reverse transcription.

cDNA synthesis. To maximize the priming efficiency of reverse transcription, a
mixture of oligo(dT) and eight DNA oligonucleotides corresponding to the mouse
TCRα constant region was used. The oligonucleotides used in this study were
synthesized by Integrated DNA Technologies, Inc.

TRAC_RT1: 5'-CTCAGCGTCATGAGCAGGTTAAAT-3'
TRAC_RT2: 5'-CAGGAGGATTCGGAGTCCCATAA-3'
TRAC_RT3: 5'-TTTTACAACATTCTCCAAGA-3'
TRAC_RT4: 5'-TTCTGAATCACCTTTAATGA-3'
TRAC_RT5: 5'-ATGAGATAATTTCTACACCT-3'
TRAC_RT6: 5'-TTTGGCTTGAAGAAGGAGCG-3'
TRAC_RT7: 5'-TTCAAAGCTTTTCTCAGTCA-3'
TRAC_RT9: 5'-TGGTCTCTTTGAAGATATCT-3'

To label the 5' end of TCRα mRNA, a DNA-RNA hybrid oligonucleotide with
12 random nucleotides serving as barcodes to tag individual mRNA molecules was
synthesized as previously reported.

Hybrid oligonucleotide: AAGCAGTGGTATCAACGCAGAGUNNNNUNNNNUNNNNUCTTrGrGrGrGrG
(r, ribonucleotide). cDNA was synthesized
in SMARTScribe reverse-transcription buffer (Clontech) with 1.0 µM each of
reverse transcription oligonucleotide, 0.5 mM of each dNTP, 5.0 mM of
dithiothreitol (DTT), 2.0 U µl⁻¹ recombinant RNase inhibitor (Takara), 1 µM hybrid
oligonucleotide, 1 M betaine (Affymetrix), 6 mM MgCl₂ and 5 U µl⁻¹ SMARTScribe
reverse transcriptase by incubating at 42 °C for 90 min, followed by 10 cycles of
incubation at 50 °C for 2 min, 42 °C for 2 min, and then one step of incubation
at 70 °C for 15 min. After removal of hybrid oligonucleotide with Uracil-DNA
Glycosylase (New England BioLabs), cDNA was purified with Agencourt AMPure
XP beads (Beckman Coulter) according to the manufacturer's manual.

Sequencing library preparation. Purified cDNA was used as templates
for a four-step PCR amplification, in which sequencing adaptors and sample
indices were introduced. The first PCR reaction was performed with
purified cDNA, 0.2 µM universal primer
(5'-CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT-3', Clontech), 0.2 µM TRAC
reverse primer 8 (5'-TTTTGTCAGTGATGAACGTT-3'), 0.2 mM each dNTP,
1.5 mM MgCl₂ and 0.02 U µl⁻¹ KOD Hot Start DNA Polymerase (EMD
Millipore). PCR parameters were as follows: initial denaturation at 95 °C for 2 min;
10 cycles of 95 °C for 20 s, 70 °C for 10 s with an increment of −1 °C per cycle,
and 70 °C for 30 s; 15 cycles of 95 °C for 20 s, 60 °C for 10 s and 70 °C for 30 s; and
a final cycle at 70 °C for 3.5 min. Amplified DNA was purified with Agencourt
AMPure XP magnetic beads for the subsequent reaction. The second PCR
reaction used the same reactants except that the reverse primer was replaced by
a nested primer
(5'-CAATTGCACCCTTACCACGACAGTCTGGTACACAGCAGGTTCTGGGTTCTGGA-3').
Cycling parameters were: 95 °C for 2 min;
6 cycles of 95 °C for 20 s, 60 °C for 10 s and 70 °C for 30 s; and a final cycle at 70 °C for
3.5 min. DNA from individual samples was extracted with Agencourt AMPure
XP magnetic beads and used for the third round of amplification with 5'RACE
TCR forward primer
(5'-AATGATACGGCGACCACCGAGATCTACACCTAATACGACTCACTATAGGGC-3')
and indexed reverse primer
(5'-CAAGCAGAAGACGGCATACGAGATXXXXXXAGTCAGTCAGCCCAATTGCACCCTTACCACGA-3',
XXXXXX for 6-nucleotide barcode). The cycling parameters
were: 95 °C for 2 min; 6 cycles at 95 °C for 20 s, 55 °C for 10 s and 70 °C
for 30 s; and a final cycle at 70 °C for 3.5 min. The PCR products were
purified with Agencourt AMPure XP magnetic beads and used for the fourth PCR
amplification with primers P1 (5'-AATGATACGGCGACCACCGAG-3') and P2
(5'-CAAGCAGAAGACGGCATACGA-3'), and the following cycling parameters:
95 °C for 2 min; 5 cycles at 95 °C for 20 s, 57 °C for 10 s and 70 °C for 30 s; and a
final cycle at 70 °C for 3.5 min. The final PCR products were separated by agarose
gel electrophoresis and a single band around 600 base-pairs was cut and extracted
with Gel Extraction and PCR Clean-Up kits (Takara).
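
The annealing schedule of the first PCR (a touchdown phase followed by a fixed-temperature phase) can be written out explicitly. The short Python sketch below lists the annealing temperature of each cycle. It assumes that the first touchdown cycle anneals at 70 °C and that each later touchdown cycle is 1 °C lower; the function name and parameters are hypothetical.

```python
def touchdown_annealing_temps(start_c=70.0, step_c=-1.0, touchdown_cycles=10,
                              plateau_c=60.0, plateau_cycles=15):
    """Return the annealing temperature (in degrees C) for every cycle.

    Assumes the first touchdown cycle uses start_c and each subsequent
    touchdown cycle changes by step_c; the plateau cycles use plateau_c.
    """
    # Touchdown phase: 70, 69, ..., 61 with the default parameters.
    temps = [start_c + step_c * cycle for cycle in range(touchdown_cycles)]
    # Plateau phase: a fixed annealing temperature for the remaining cycles.
    temps += [plateau_c] * plateau_cycles
    return temps
```

With the defaults, the list has 25 entries: ten touchdown cycles ending at 61 °C, followed by fifteen cycles at 60 °C.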

High-throughput sequencing. Samples were quantified with Kapa Library
Quantification kits (Kapa Biosystems) and sequenced on a MiSeq sequencer
(Illumina) using 200 cycles of read 1, 6 cycles of index read and 200 cycles
of read 2 with the following customized primers:

read 1: 5'-CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT-3'
index read: 5'-TCGTGGTAAGGGTGCAATTGGGCTGACTGACT-3'
read 2: 5'-AGTCAGTCAGCCCAATTGCACCCTTACCACGA-3'

Data analysis. Barcoded sequencing data were analysed with MIGEC software.
Briefly, unique molecular identifier sequences were extracted from raw
sequencing data (read 1) with the MIGEC/Checkout routine. Reads (>5) bearing the same
unique molecular identifier were grouped and assembled to generate consensus
sequences with MIGEC/Assemble. Variable (V) and joining (J) segment mapping,
CDR3 extraction, and error correction were performed with MIGEC/CdrBlast as
previously described, which eliminates PCR and sequencing errors and
normalizes the output data as cDNA counts that represent the TCR clonotypes
in a population.
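
The idea behind grouping reads by unique molecular identifier (UMI) can be illustrated with a simplified Python sketch. This is not the MIGEC implementation: it assumes that reads sharing a UMI are already aligned and of equal length, it builds the consensus by a per-position majority vote, and the `min_reads` threshold and function names are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=5):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI.

    Reads with the same UMI are assumed to be aligned and of equal length
    (a simplification). UMIs supported by fewer than min_reads reads are
    dropped; min_reads is an illustrative parameter.
    """
    groups = defaultdict(list)
    for umi, sequence in reads:
        groups[umi].append(sequence)

    consensus = {}
    for umi, sequences in groups.items():
        if len(sequences) < min_reads:
            continue
        # Majority vote at each position removes sporadic PCR/sequencing errors.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus


def clonotype_counts(consensus):
    """Count cDNA molecules (UMIs) per consensus sequence."""
    return Counter(consensus.values())
```

Because each UMI tags one original mRNA molecule, counting UMIs rather than reads gives clonotype abundances that are not inflated by PCR amplification.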

Comparison of TCRα repertoires between CNS3-deficient and -sufficient
mice at the protein level was performed using the VDJtools post-analysis framework
(https://github.com/mikessh/vdjtools). Pearson correlation of clonotype
frequencies for the shared TCR clones was used for the generation of the
dendrogram. Clonal diversities of TCRα repertoires were evaluated using the inverse
Simpson index computed separately for individual samples after downsampling
the repertoires to the size of the smallest sample from the same organ. A similar
downsampling strategy, not weighted by clonotype frequencies, was used to
compute the average size of added nucleotides in CDR3. A mathematical model
was used to assess the strength of CDR3 amino acid interactions with pMHC
complexes. Numbers of strongly interacting amino acid residues (LFIMVWCY)
were calculated for the V-segment part of TCRα CDR3 and the V-J segment
junction. Those numbers were then weighted by the corresponding clonotype
frequencies and the resulting sums were used for the comparisons between
samples.
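
The two repertoire statistics described above can be sketched in Python as follows. This is a minimal illustration and not the VDJtools code: downsampling is assumed to be random sampling of cDNA molecules without replacement, the inverse Simpson index is computed as 1/Σp², and the `region` argument is a hypothetical stand-in for the step that selects the V-segment part or the V-J junction of a CDR3.

```python
import random
from collections import Counter

# Residues treated as strongly interacting in the text.
STRONG_RESIDUES = set("LFIMVWCY")


def downsample(clonotypes, size, seed=0):
    """Randomly draw `size` molecules without replacement.

    clonotypes: dict mapping a clonotype (e.g. CDR3 string) to its cDNA count.
    """
    pool = [clone for clone, count in clonotypes.items() for _ in range(count)]
    return Counter(random.Random(seed).sample(pool, size))


def inverse_simpson(clonotypes):
    """Inverse Simpson index: 1 / sum of squared clonotype frequencies."""
    total = sum(clonotypes.values())
    return 1.0 / sum((count / total) ** 2 for count in clonotypes.values())


def weighted_strong_residues(clonotypes, region=lambda cdr3: cdr3):
    """Frequency-weighted number of strongly interacting residues.

    region: function returning the part of the CDR3 to score; by default the
    whole CDR3 is scored, which is a simplification of this sketch.
    """
    total = sum(clonotypes.values())
    return sum(
        (count / total) * sum(residue in STRONG_RESIDUES for residue in region(cdr3))
        for cdr3, count in clonotypes.items()
    )
```

Downsampling every sample to the size of the smallest one before calling `inverse_simpson` keeps the diversity estimates comparable, because the index otherwise depends on sequencing depth.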

RNA sequencing and data analysis. Mature Foxp3− CD4+CD8− SP
(TCRβ+GFP−CD62LhiCD69lo) thymocytes, Foxp3+ CD4 SP thymocytes (thymic
Treg cells), peripheral resting (CD44loCD62Lhi) and activated (CD44hiCD62Llo)
Treg cells were FACS-sorted from ~6-8-week-old male Foxp3gfp and Foxp3ΔCNS3-gfp
littermates. RNA was extracted and cDNA libraries were generated after SMART
amplification (Clontech). Libraries were sequenced using a HiSeq 2000
platform (Illumina) according to a standard paired-end protocol. Reads were first
processed with Trimmomatic to remove TruSeq adaptor sequences and bases
with quality scores below 20, and reads with fewer than 30 remaining bases were
discarded. Trimmed reads were then aligned to the mm10 mouse genome with the
STAR spliced-read aligner. For each gene from the RefSeq annotations, the
number of uniquely mapped reads overlapping with the exons was counted
with HTSeq (http://www-huber.embl.de/users/anders/HTSeq/). Genes with
fewer than 50 read counts were considered as not expressed and filtered out.
Principal component analysis (PCA) was performed (n = 11,962) for clustering
gene expression. Differential gene expression was estimated using the DESeq
package. To determine activation-related transcriptional signatures in Treg cells, the
differences between read counts of peripheral activated versus resting Treg cells
from wild-type Foxp3gfp mice were evaluated by fold-change and
Benjamini-Hochberg corrected P values (false discovery rate < 0.001) (Supplementary Data
1 and 2). For gene expression comparisons, previously published transcriptional
signatures of TCR-dependent genes in Treg cells were used. The distribution
of gene expression changes is shown for transcriptional signature genes and
the rest of all expressed genes. A one-tailed Kolmogorov-Smirnov test was used to
determine the significance of the difference between the distributions of signature genes and the
rest of expressed genes.
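
Two of the filtering steps above, removing low-count genes and correcting P values for multiple testing, can be illustrated with plain Python. This sketch is not the DESeq code path: the text does not say whether the 50-read threshold applies per sample or to the total, so a single count per gene is assumed, and the function names are hypothetical.

```python
def filter_expressed(counts, min_reads=50):
    """Keep genes with at least min_reads reads.

    counts: dict mapping gene name to a read count (assumed here to be one
    number per gene; the aggregation across samples is not specified).
    """
    return {gene: count for gene, count in counts.items() if count >= min_reads}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda index: pvalues[index])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

Genes whose adjusted P value falls below the chosen false discovery rate (0.001 in the text) would then be retained as signature genes.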

Chromatin immunoprecipitation. We cross-linked 1 x 10° cells with 1%
formaldehyde for 5 min at room temperature. Cross-linked cells were lysed and nuclei
were resuspended in 250 µl nuclear lysis buffer containing 1% SDS. Chromatin
input samples were prepared by sonication of cross-linked nuclear lysates. For
histone ChIPs, nuclear lysates were subjected to micrococcal nuclease (MNase)
digestion before sonication. Nuclei were resuspended in 100 µl MNase (New
England Biolabs) at 12,000 U ml⁻¹ for 1 min at 37 °C. The reaction was stopped
by addition of 10 µl of 0.5 M EDTA. Chromatin input samples were incubated
overnight at 4 °C with antibodies against H3K4me1 (Abcam), H3K4me3
(Millipore) or H3K27ac (Abcam), and precipitated for 90 min at 4 °C using protein
A Dynabeads (Life Technologies). After thorough washing, bead-bound
chromatin was subjected to proteinase K digestion and decrosslinking overnight at
65 °C. DNA fragments were isolated using a Qiagen PCR purification kit. Relative
abundance of precipitated DNA fragments was analysed by qPCR using Power
SYBR Green PCR Master Mix (Applied Biosystems). The following primers were
used for qPCR:

Gm5069: forward: 5'-TAAGCAATTGGTGGTGCAGGATGC-3', reverse: 5'-AAAGGGTCATCATCTCCGTCCGTT-3'
Hspa2: forward: 5'-TCGTGGAGAGTTGTGAGAAGCGA-3', reverse: 5'-AACGTTAGGACGAAAGCGTCAGGA-3'
Hsp90ab: forward: 5'-TTACCTTGACGGGAAAGCCGAGTA-3', reverse: 5'-TTCGGGAGCTCTCTTGAGTCACC-3'
Rpl30: forward: 5'-TCGGCTTCACTCACCGTCTTCTTT-3', reverse: 5'-TGTCCTCTGTGTATGCTAGGTTGG-3'
Foxp3 promoter: forward: 5'-TAATGTGGCAGTTTCCCACAAGCC-3', reverse: 5'-AATACCTCTCTGCCACTTTCGCCA-3'
CNS1: forward: 5'-AGACTGTCTGGAACAACCTAGCCT-3', reverse: 5'-TGGAGGTACAGAGAGGTTAAGAGCCT-3'
CNS2: forward: 5'-ATCTGGCCAAGTTCAGGTTGTGAC-3', reverse: 5'-GGGCGTTCCTGTTTGACTGTTTCT-3'
CNS3: forward: 5'-TCTCCAGGCTTCAGAGATTCAAGG-3', reverse: 5'-ACAGTGGGATGAGGATACATGGCT-3'

Relative enrichment was calculated by normalizing to background binding to
the control region (Gm5069).
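
One common way to express enrichment relative to a control region from qPCR data is the ΔCt method, sketched below in Python. The text does not give the formula that was used, so this is an assumption: both Ct values are taken from the same ChIP sample, amplification efficiency is assumed to be perfect (a factor of 2 per cycle), and the function name is hypothetical.

```python
def relative_enrichment(ct_target, ct_control, efficiency=2.0):
    """Fold enrichment of a target region over the control region.

    A lower Ct means more template, so the enrichment is
    efficiency ** (ct_control - ct_target). Perfect doubling per cycle
    (efficiency = 2.0) is an assumption of this sketch.
    """
    return efficiency ** (ct_control - ct_target)
```

For example, a target region detected three cycles earlier than the Gm5069 control region (`relative_enrichment(24.0, 27.0)`) corresponds to an eightfold enrichment under these assumptions.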

Ig isotype ELISA and immunofluorescence staining. Quantification of serum
Ig isotypes was performed by ELISA as previously described. Tissue sections
from gender-matched Rag1−/− mice were used to detect mouse autoantibodies.
Briefly, organs from the Rag1−/− mice were dissected, fixed with neutral buffered
formalin, embedded in paraffin and sectioned. After deparaffinization with
EZPrep buffer (Ventana Medical Systems) and antigen retrieval with cell
conditioning solution (Ventana Medical Systems), the sections were blocked for 30 min
with Background Buster solution (Innovex), followed by avidin/biotin blocking
for 8 min, mouse serum (1:50 dilution) incubation for 5 h and biotinylated horse
anti-mouse IgG (Vector Labs) incubation for 1 h. The detection was performed
with streptavidin-horseradish peroxidase (Ventana Medical Systems) followed
by incubation with Tyramide Alexa Fluor 488 (Invitrogen). The slides were then
counterstained with DAPI (Sigma Aldrich) for 10 min, mounted, scanned with a
Mirax scanner and visualized with Pannoramic Viewer (3DHISTECH). Scanned
images were scored and representative snapshots were processed with Photoshop
(Adobe) to switch the green and red channels for presentation purposes.

Generation of mixed bone marrow chimaeras. Mixed bone marrow chimaeras
were generated as previously described. Briefly, recipient mice were irradiated
(9.5 Gy) 24 h before intravenous injection of 10 x 10° bone marrow cells from
CD45.1+ Foxp3gfp and CD45.2+ Foxp3ΔCNS3-gfp mice mixed at a 1:1 ratio. After bone
marrow transfer, the recipient mice were given 2 mg ml⁻¹ neomycin
in drinking water for 3 weeks and analysed 8-10 weeks later.

Histological analysis. Tissue samples were fixed in 10% neutral buffered formalin
and processed for haematoxylin and eosin staining. Stained slides were scored for
tissue inflammation as previously described.

Experimental autoimmune encephalomyelitis induction. Experimental
autoimmune encephalomyelitis was induced by immunization with myelin
oligodendrocyte glycoprotein peptide 35-55 (MOG35-55, GenScript) in complete
Freund's adjuvant (CFA, Sigma) and mice were monitored for disease as previously
described.

Extended Data Figure 1 | CNS3 is required in precursor cells for
optimal Treg cell differentiation. a, Diminished numbers of thymic Treg
cells in 6-8-week-old CNS3-deficient mice. Two-tailed Mann-Whitney
test. The data show individual mice and median, and represent 1 of
>2 independent experiments. Foxp3gfp (n = 9); Foxp3ΔCNS3-gfp (n = 11).
SI, small intestine. b, c, Flow cytometric analysis of CD4 and CD8 SP
thymocyte subsets, including thymic Treg precursor (CD4+CD25+Foxp3−)
cells (b) and Foxp3 expression (c) in 6-8-week-old Foxp3ΔCNS3-gfp mice
(n = 11) and Foxp3gfp (n = 9) littermates. Unpaired Mann-Whitney
test. d, e, CNS3-dependent Treg cell differentiation in heterozygous
Foxp3CNS3-fl-gfp/+ and Cd4Cre Foxp3CNS3-fl-gfp/+ (d), or Foxp3CNS3-fl-gfp/+ and
LckCre Foxp3CNS3-fl-gfp/+ females (e). GFP+ and GFP− Treg cells in these mice
express Foxp3CNS3-fl-gfp or wild-type Foxp3+ alleles, respectively. The data
represent 1 of >2 independent experiments (n > 3 mice per group).
f, Acute ablation of CNS3 impairs Treg induction in vitro. Yellow
fluorescent protein (YFP)+ (tamoxifen treated) or YFP− (vector
control) naive CD4+ T cells from UbcCre-ERT2 Foxp3CNS3-fl-gfp R26Y males
were cultured under in vitro Treg induction conditions. The data show
mean ± s.e.m. of triplicate cultures and represent 1 of 2 independent
experiments. Two-tailed unpaired t-test. g, h, Acute ablation of CNS3 in
differentiated Treg cells does not affect Foxp3 expression level on a per cell
basis or the stability of mature Treg cells. g, Expression of Foxp3, CD25
and CD44 in Treg cells on day 4 after tamoxifen treatment. h, YFP+ and
YFP− Treg cells from tamoxifen-treated UbcCre-ERT2 Foxp3CNS3-fl-gfp R26Y
males were cultured in the presence of IL-2, IFNγ, IL-4, IL-6 and IL-12 for
4 days. The data represent 2 independent experiments.

Extended Data Figure 2 | CNS3 is dispensable for the suppressor
function of differentiated Treg cells in vivo. a-f, In vivo assessment of the
suppressor function of Treg cells upon acute ablation of CNS3. Treg cells
(CD4+GFP+) isolated from Foxp3gfp or Foxp3CNS3-fl-gfp mice were activated
with CD3 and CD28 antibody-coated beads in vitro for five days and then
transduced with retroviruses expressing Cre recombinase and a Thy1.1
reporter. Three days later, Thy1.1+CD4+GFP+ cells were sorted by FACS
for the suppressor assay. a, CD4+Foxp3− and CD8+ effector T cells (Teff)
sorted from Foxp3DTR reporter mice seven days after diphtheria toxin (DT)
injection (1 µg intraperitoneal per mouse) were transferred alone or with
equal amounts of Thy1.1+ Cre-transduced Foxp3gfp or Foxp3CNS3-fl-gfp
Treg cells into Tcrb−/− Tcrd−/− recipients. b, Mice were weighed before and
after T-cell transfer, and relative weight changes were assessed at weeks
3 and 4 post-transfer. c-f, Four weeks after adoptive transfer, cells were
recovered and analysed for Treg frequencies and Foxp3 expression (c),
CD4+TCRβ+Foxp3− and CD8+TCRβ+ cell numbers (d), IFNγ (e) and
IL-13 (f) production. Unpaired Mann-Whitney test (n = 5 per group).

Extended Data Figure 3 | Epigenetic modifications at the Foxp3 locus
during Treg differentiation. a, b, ChIP-qPCR analysis of H3K4me3 (a)
and H3K27ac (b) at the Foxp3 locus and control loci (Hspa2, Rpl30 and
Gm5069) in B cells, DP thymocytes, naive CD4+ T and Treg cells.
FACS-sorted cells from wild-type male Foxp3gfp mice were used for ChIP-qPCR.
Relative enrichment was calculated by normalizing to background binding
to control region (Gm5069). c-e, ChIP-qPCR analysis of H3K4me1 (c),
H3K4me3 (d) and H3K27ac (e) in the Foxp3 locus in mature Treg cells
isolated from wild-type Foxp3gfp and Foxp3ΔCNS3-gfp male mice normalized
to the background binding to the Gm5069 locus. f, CNS3-dependent
deposition of H3K27ac at the Foxp3 promoter in Foxp3− CD4+ T cells
during in vitro Treg cell induction. Foxp3gfp or Foxp3ΔCNS3-gfp naive CD4+ T
cells were cultured under in vitro Treg cell differentiation conditions. After
three days of culture, GFP− and GFP+ cells were sorted for ChIP-qPCR
analysis. Two-tailed unpaired t-test. g, Inhibition of Treg induction in
vitro by bromodomain protein inhibitor iBET. Naive CD4+ T cells from
wild-type Foxp3gfp males were used for Foxp3 in vitro induction in the
presence of indicated concentrations of iBET or vehicle. h, Schematic of
the chromatin dynamics at CNS3 and the Foxp3 promoter during Treg cell
differentiation. The data are shown as means ± s.e.m. of triplicates and
represent 1 of 2 independent experiments.

Extended Data Figure 4 | CNS3 facilitates Foxp3 induction and shapes
Treg cell repertoire. a, Differential effect of CNS3 on Treg cell in vitro
development of mature non-Treg CD4 SP T cells. CD4 SP thymocytes
(CD4+CD8−TCRβhiGFP−CD25−CD62LhiCD69lo) were pooled and sorted
from male Foxp3gfp and Foxp3ΔCNS3-gfp littermates (n = 7 each group)
for in vitro Treg cell induction performed with titrated CD3 antibody
and lethally irradiated antigen-presenting cells isolated from wild-type
B6 spleens in the presence of TGFβ and recombinant IL-2. Foxp3
expression was analysed four days later and the relative changes in the
ratios of Foxp3-expressing cells in the absence of CNS3 were calculated
by comparing to CNS3-sufficient groups. Data depict means ± s.e.m.
of five replicate cultures and represent 1 of 3 independent experiments.
b, Flow cytometric analysis of Nur77 protein expression in CNS3-deficient
and -sufficient Treg cells (n = 5 for each group). Two-tailed
unpaired Mann-Whitney test. The data represent 1 of >2 independent
experiments. c, Increased Nur77 protein levels in CNS3-deficient Treg
cells developed after conditional ablation of CNS3 upon
tamoxifen-induced activation of UbcCre-ERT2. Bone marrow of CD45.1+ Foxp3gfp
and CD45.2+ UbcCre-ERT2 Foxp3CNS3-fl-gfp R26Y mice was collected from
donor mice treated with tamoxifen, mixed at a 1:1 ratio and transferred
into lethally irradiated Tcrb−/− Tcrd−/− recipients. CD45.1+CD4+GFP+,
CD45.2+YFP−GFP+ and CD45.2+YFP+GFP+ cells were sorted for flow
cytometric analysis of Nur77 protein levels 10 weeks after bone marrow
transfer (n = 5). Unpaired Mann-Whitney tests were used to compare
CD45.2+YFP+GFP+ and CD45.2+YFP−GFP+ or CD45.2+YFP+GFP+
and control (CD45.1+CD4+GFP+) groups. The data show medians of
individual mice and represent >3 independent experiments. d, Nur77
expression levels in thymic Treg precursors (CD25+Foxp3−), immature
(CD62LloCD69hi) and mature (CD62LhiCD69lo) CD4 SP thymocytes,
and peripheral Foxp3− CD4+ and CD8+ T cells in 6-7-week-old Foxp3gfp
(n = 5) and Foxp3ΔCNS3-gfp (n = 4) littermates. Unpaired Mann-Whitney
test. The data show medians of individual mice and represent >3
independent experiments. e, Differential Nur77 expression in peripheral
resting (CD44loCD62Lhi) and activated (CD44hiCD62Llo) Treg cells
(wild-type Foxp3gfp). The data represent 1 of >3 independent experiments.
f, g, Upregulation of Nur77 expression in resting (CD44loCD62Lhi) (f) and
activated (CD44hiCD62Llo) (g) CNS3-deficient Treg cells in 6-7-week-old
Foxp3gfp (n = 5) and Foxp3ΔCNS3-gfp (n = 4) littermates. Unpaired
Mann-Whitney test. The data represent >3 experiments. h, i, CTLA4 (h) and
Ki-67 (i) expression by CNS3-deficient and -sufficient Treg cells in Foxp3gfp
(n = 9) and Foxp3ΔCNS3-gfp (n = 11) mice (h). Foxp3gfp (n = 5); Foxp3ΔCNS3-gfp
(n = 4) (i). Two-tailed unpaired Mann-Whitney test. The data represent
1 of >3 independent experiments.


© 2015 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 5 image: panels a-i, including a principal component plot (a), cumulative fraction of genes plotted against log2 expression fold change (b-e), and comparisons of Foxp3gfp and Foxp3ΔCNS3-gfp samples from thymocytes and periphery (f-i).]

Extended Data Figure 5 | Influence of CNS3 on Treg cell repertoire.
a, Principal component analysis of mRNA expression in CNS3-deficient
and -sufficient mature Foxp3− and Foxp3+ CD4 SP thymocytes, and
peripheral resting and activated Treg cells. RNA-seq was performed with
three and four biological replicates for cells sorted from male Foxp3gfp
and Foxp3ΔCNS3-gfp littermates, respectively. Dots represent samples from
individual mice. b, c, Relative gene expression levels (cumulative fraction
of genes) in CNS3-sufficient and -deficient peripheral resting (b) or
activated (c) Treg cells in comparison to those up- and downregulated
in activated versus resting Treg cells isolated from Foxp3gfp mice. The
numbers of genes in each comparison group are indicated in parentheses.
d, e, Relative gene expression levels in CNS3-sufficient and -deficient
peripheral rTreg (d) or aTreg (e) cells in comparison to those downregulated
in activated Treg cells subjected to acute TCR ablation versus mock
treatment. The numbers of genes in each comparison group are indicated
in parentheses. One-tailed Kolmogorov-Smirnov test. f, Flow cytometric
analysis of Foxp3 expression level (median fluorescence intensity (MFI))
in CNS3-sufficient and -deficient Treg cells after expansion in lymphopenic
recipients. Treg cells were sorted from mixed bone marrow chimaeras
of CD45.1+ Foxp3gfp and CD45.2+ Foxp3ΔCNS3-gfp mice and mixed at a
1:1 ratio, and co-transferred with wild-type naive Foxp3− CD4+ T cells
into Tcrb−/− Tcrd−/− recipients treated with MHC-II-blocking antibody
or isotype-control IgG before and after the transfer (n = 5 per group).
Mean ± s.e.m.; the data represent 1 of 3 independent experiments.
Unpaired t-test revealed no statistically significant difference between
matched CNS3-deficient and -sufficient groups (P > 0.3). g, Comparison
of CNS3-sufficient and -deficient Treg cells in the competitive environment of
heterozygous Foxp3gfp/+ and Foxp3ΔCNS3-gfp/+ females (6-8 weeks of age).
In contrast to CNS3-sufficient Treg cells, CNS3-deficient cells are relatively
enriched in the periphery in comparison with the thymus. Ratios of GFP−
to GFP+ Treg cells are inversely proportional to the relative abundance of
Foxp3gfp or Foxp3ΔCNS3-gfp Treg cells in the Treg pool. Wilcoxon
matched-pairs signed rank test; Foxp3gfp/+ (n = 5), Foxp3ΔCNS3-gfp/+ (n = 8). Linked
circles represent samples from the same mice. Data represent 1 of
2 independent experiments. h, i, Numbers of strongly interacting amino
acid residues (LFIMVWCY) were calculated for the V-segment of TCRα
CDR3 (binned to germline) and V-J segment junction, and weighted by
the corresponding clonotype frequencies. Sums of the weighted scores
were used for the comparisons between CNS3-deficient and -sufficient
groups (unpaired t-test). The data represent the analysis of pooled TCR
sequences derived from the indicated thymic (h) and peripheral
(i) CD4+ naive (TN), activated effector (Teff) and Treg cell subsets isolated
from individual Foxp3gfp (n = 5) and Foxp3ΔCNS3-gfp (n = 3) mice.
Box-and-whisker plots show minimum, maximum, first and third quartiles and
median.

Extended Data Figure 6 | Selective modulation of autoimmune
responses in mice lacking CNS3. a, CNS3 deficiency does not affect
antibody production against a subset of autoantigens. Foxp3ΔCNS3-gfp
and Foxp3gfp littermates (n = 4 per group). Box-and-whisker plots
show minimum, maximum, first and third quartiles and median. Data
represent 1 of 2 independent experiments. b-f, CNS3 deficiency decreases
experimental autoimmune encephalomyelitis severity. On immunization
with MOG peptide in CFA, mice of indicated genotypes were assessed
for the severity of limb paralysis (b), effector T-cell numbers (c), Treg cell
frequency (d), Foxp3 expression levels (e) and inflammatory cytokine
production (f). Foxp3gfp (n = 8); Foxp3ΔCNS3-gfp (n = 11). Unpaired t-test
(b) or Mann-Whitney test (c-f). Mean and s.e.m. are presented (b).
*P<0.05, **P<0.01. The data represent 2 independent experiments.
g, h, Analysis of the proportion of Treg cells in CD4+TCRβ+ cell population
(g) and level of Foxp3 expression (MFI) (h) in an in vivo suppressor assay
of CNS3-deficient or -sufficient Treg cells (Fig. 3e-h). Two-tailed unpaired
Mann-Whitney test.

Extended Data Figure 7 | Compromised suppressive function of
CNS3-deficient Treg cells. a, Autoimmune diseases in Foxp3ΔCNS3-gfp AireKO/KO
(DKO) mice. Arrow indicates the inflammatory lesions in the tail of a
3-week-old mouse with an early onset of autoimmunity (n >11) (i).
De-pigmentation in a 6-week-old mouse with delayed onset of
autoimmunity (n > 16) (ii). b, Analysis of serum Ig isotypes in
Foxp3ΔCNS3-gfp AireKO/KO and littermate control mice using ELISA
(n = 8 per group). Error bars, mean ± s.e.m. Two-way ANOVA. c, Flow
cytometric analysis of Foxp3 expression by Treg cells. The data show one of
at least three mice per group and represent >3 independent experiments.
d-i, Analysis of the ability of CNS3-deficient and -sufficient Treg cells to
control CNS3 Aire DKO effector T cells on adoptive transfer into
T-cell-deficient recipients. Flow cytometric analysis of non-Treg CD4+
T-cell numbers (d), Treg cell numbers (e), Foxp3 expression levels (f), IFNγ
production (g), IL-17 production (h), and serum IgG1 and IgG2b levels
(i) in recipient mice transferred with CNS3 and Aire DKO effector
T cells (Foxp3− CD4+ and CD8+) at a 10:1 ratio with Treg cells from
Aire-sufficient Foxp3gfp or Foxp3ΔCNS3-gfp mice. Two-tailed unpaired
Mann-Whitney tests (d-h) or unpaired t-test (i). Error bars,
mean ± s.e.m. (i). The recipient mice were analysed 7 weeks after
adoptive T-cell transfer (n = 5 per group).

Extended Data Figure 8 | Theoretical impact of CNS3 on Treg TCR
repertoire. a, Hypothetical distribution of TCRs expressed by Treg and
non-Treg CD4+ T cells according to their affinities for self-antigens.
Precursor cells expressing TCRs within a certain low-affinity window
are positively selected and become 'conventional' CD4+ T cells, and
those with higher affinities for self-antigens differentiate into Treg cells.
CNS3 promotes the differentiation of Treg cells and broadens their TCR
repertoire by facilitating Foxp3 expression predominantly in response to
lower strength ('suboptimal') inducing TCR signals. b, After expansion
in the periphery, CNS3-deficient Treg cells reach similar numbers as their
wild-type counterparts, with some TCRs underrepresented (A), some
minimally affected (B), and some overrepresented (C). Tconv, conventional
T cells.
Depletion of fat-resident Treg cells prevents
age-associated insulin resistance


Sagar P. Bapat, Jae Myoung Suh, Sungsoon Fang, Sihao Liu, Yang Zhang, Albert Cheng, Carmen Zhou, Yuqiong Liang,
Mathias LeBlanc, Christopher Liddle, Annette R. Atkins, Ruth T. Yu, Michael Downes, Ronald M. Evans & Ye Zheng


Age-associated insulin resistance (IR) and obesity-associated IR
are two physiologically distinct forms of adult-onset diabetes.
While macrophage-driven inflammation is a core driver of
obesity-associated IR, the underlying mechanisms of the
obesity-independent yet highly prevalent age-associated IR are largely
unexplored. Here we show, using comparative adipo-immune
profiling in mice, that fat-resident regulatory T cells, termed
fTreg cells, accumulate in adipose tissue as a function of age, but
not obesity. Supporting the existence of two distinct mechanisms
underlying IR, mice deficient in fTreg cells are protected against
age-associated IR, yet remain susceptible to obesity-associated IR
and metabolic disease. By contrast, selective depletion of fTreg cells
via anti-ST2 antibody treatment increases adipose tissue insulin
sensitivity. These findings establish that distinct immune cell
populations within adipose tissue underlie ageing- and
obesity-associated IR, and implicate fTreg cells as adipo-immune drivers and
potential therapeutic targets in the treatment of age-associated IR.

The young, lean state is associated with insulin sensitivity, while
both ageing and obesity can lead to the development of IR (Extended
Data Fig. 1a). To explore key immune cell types that drive age- versus
obesity-associated IR, we quantitatively profiled the immune cell
components of adipose depots using a flow cytometry approach termed
adipo-immune profiling (AIP) (Extended Data Fig. 1b-d and Extended
Data Table 1). In contrast to the decrease in anti-inflammatory M2
adipose tissue macrophages and eosinophils observed in obesity-driven
IR, AIP revealed that these cell populations are largely unperturbed in
visceral adipose tissue (VAT) from aged mice (M2 adipose tissue
macrophages, aged: 33.6 ± 3.8% (mean ± s.d.), young: 29.8 ± 4.1%,
obese: 22.9 ± 6.3%; eosinophils, aged: 4.4% ± 1.6%, young: 4.7% ± 0.7%,
obese: 0.8% ± 1.0%; Fig. 1a). Instead, the relative portion of the
non-macrophage compartment is significantly increased in aged
compared to young or obese mice (aged: 24.3 ± 4.6%, young: 17.9 ± 2.8%,
obese: 15.7 ± 3.8%; Fig. 1a), which is largely attributable to an
~12-fold expansion in the fTreg cell population (aged: 5.0 ± 1.2%,
young: 0.4 ± 0.1%, obese: 0.1 ± 0.1%; Fig. 1a, b). These
condition-dependent AIP signatures of adipose tissue suggest that distinct
pathophysiologic processes drive age- and obesity-associated IR and
specifically implicate fTreg cells in age-associated IR.
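
As a quick arithmetic check on the fold expansion quoted above, the following minimal Python sketch recomputes the aged-versus-young ratio from the mean fTreg percentages reported in the text. The function and variable names are illustrative only and are not part of the original analysis.

```python
# Mean fTreg abundance (% of CD45.2+ cells in VAT), as reported in the text.
ftreg_percent = {"young": 0.4, "aged": 5.0, "obese": 0.1}


def fold_change(condition, reference, table=ftreg_percent):
    """Ratio of the mean abundance in `condition` to that in `reference`."""
    return table[condition] / table[reference]


aged_vs_young = fold_change("aged", "young")
print(f"aged vs young: {aged_vs_young:.1f}-fold")
```

The ratio of the two means is in line with the approximately 12-fold expansion stated in the text; it ignores the reported standard deviations.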

Treg cells in fat express Pparg at a high level, which allows them to
expand their relative numbers approximately 6-7-fold. Knockout of
Pparg in Treg cells blocks this accumulation. Accordingly, we exploited
this observation by creating Foxp3^Cre (Foxp3-IRES-YFP-Cre) Pparg^fl/fl
mice in which Treg cells are selectively depleted (from 6.1% to 0.9%)
from VAT (fTreg knockout mice; Fig. 2a and Extended Data Fig. 2a, b),
although the depletion of PPARγ-positive Treg cells in tissues such as
muscle and liver cannot be ruled out. This depletion is achieved
without significantly altering the immune profiles of subcutaneous adipose
tissue (SAT) or spleen (Extended Data Fig. 2c, d). Importantly, the
6.8-fold VAT-specific loss of fTreg cells does not elicit any overt signs of
systemic inflammation generally associated with Treg cell dysfunction.
Aged fTreg knockout mice have normal-sized spleens and increased
CD62L^hi CD44^lo naive CD4⁺ T-cell populations compared to wild-type
controls (Fig. 2c and Extended Data Fig. 3a). The normal intestinal
histology provides additional evidence that the Treg cell population is
not perturbed (Extended Data Fig. 3b, c). Furthermore, no differences
are observed in the levels of inflammatory cytokines, including TNFα,
IL-1, IL-6, IFNγ and IL-17, in the serum of aged fTreg knockout
compared to control mice (Extended Data Fig. 3d).

Notably, the selective loss of fTreg cells attenuates many of the hallmarks
of age-associated metabolic dysregulation. Aged fTreg knockout mice weigh less
Figure 1 | fTreg cells are selectively enriched in aged mice. a, VAT AIPs
from mice at 12 weeks (young, n = 10), 44 weeks (aged, n = 10) and in
diet-induced obese mice (n = 10). Immune cell abundance is expressed
as percentage of CD45.2⁺ cells. ATM, adipose tissue macrophages; DN,
double negative. b, Changes in immune cell abundance between indicated
groups, expressed as fold change in cell number per gram of VAT. Obese
mice were fed a HFD for 12 weeks from 12 weeks of age. NK, natural killer
cells; NKT, natural killer T cells. Data are mean ± s.e.m. #, false discovery
rate <2%.
Figure 2 | fTreg knockout mice are protected from general hallmarks
of metabolic ageing. a, Representative FACS plots of fTreg knockout
(KO) (Foxp3^Cre Pparg^fl/fl) and control (Foxp3^Cre Pparg^+/+) mice depicting
Treg cell enrichment in VAT and spleen (~15 months, CD45.2⁺ CD4⁺
gating). b, Total body weight (n = 15 per group), and lean and fat mass of
control and fTreg knockout mice (~12 months, n = 8 per group). c, Mass
of VAT, SAT and spleen in aged control and fTreg knockout mice
(~15 months, n = 9 per group). d, Cumulative food consumption of
control and fTreg knockout mice (~8-9 months, n = 8 per group).
e, f, Average 24 h respiratory exchange ratio (RER) (e) and average oxygen
consumption (VO₂) (f) of aged control and fTreg knockout mice
(~11 months, n = 6 per group). g, Core body temperature of control and
fTreg knockout mice (~13 months, n = 9 per group). h, Principal component analysis
of non-macrophage AIPs of young (12 weeks), aged (~15 months) and aged
fTreg knockout (~15 months) mice (n = 9 per group). Data are mean ± s.e.m.
*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, Student's t-test.


than control mice and are leaner (decreased VAT and SAT adiposity)
despite increased food consumption (Fig. 2b-d). In addition, the
respiratory exchange ratio (Fig. 2e), oxygen consumption (Fig. 2f) and
core body temperature (Fig. 2g) are increased in aged fTreg knockout
mice compared to control mice. These marked improvements suggest
that the age-associated metabolic phenotype is closely linked with VAT
immune responses, and that in the aged setting, a reduction in fTreg cell
numbers may be protective. Indeed, the AIPs of aged fTreg knockout
mice are shifted towards those of young mice, as visualized by principal
component analysis (Fig. 2h).

Although the fTreg knockout phenotype is most pronounced in aged
mice, a reduction in fTreg cell levels can also be found in obese fTreg
knockout mice (Fig. 3a). However, the beneficial metabolic effects
of fTreg ablation are only significant in age-associated metabolic
dysregulation, in which fasting serum glucose and insulin levels are
significantly reduced (Fig. 3b, c and Extended Data Fig. 4a). Furthermore,
aged fTreg knockout mice display smaller glucose excursions during
glucose tolerance tests and increased sensitivity during insulin tolerance


138 | NATURE | VOL 528 | 3 DECEMBER 2015 


tests compared to weight-matched control mice (Fig. 3e). Again, these
improvements in glucose homeostasis are observed only in aged mice;
no significant differences are seen in young or obese fTreg knockout
mice (Fig. 3d, f), which is consistent with the largely unchanged AIPs
of obese fTreg knockout mice (Extended Data Fig. 5a, b).

Although fTreg cells were previously implicated in the
insulin-sensitizing function of the PPARγ agonists thiazolidinediones
(TZDs), in our studies fTreg knockout mice display metabolic
improvements in response to the TZD rosiglitazone similar to those of
control mice. These beneficial effects of TZDs are evident with either
direct treatment of obese mice (therapeutic intervention; Extended Data
Fig. 6a-g) or prophylactic treatment (drug intervention coincident with
high-fat diet (HFD) feeding; Extended Data Fig. 6h-l). Additionally, we find that the
insulin-sensitizing effects of rosiglitazone precede the TZD-induced
expansion of fTreg cells in HFD-fed mice (Extended Data Fig. 6m-q).
While intercolony variation cannot be ruled out, our findings do not
support a significant role for fTreg cells in the therapeutic mechanism
of action of TZDs.

Histologically, aged fTreg knockout VAT depots appear similar
to those of control mice, and inflammatory processes such as macrophage
crowning are observed at comparable frequencies (Fig. 3g and data
not shown). However, aged fTreg knockout VAT has increased levels
of TNFα (Extended Data Fig. 7a), increased expression of Vegfa
(implicated in adipose remodelling and insulin sensitivity; Extended
Data Fig. 7b) and decreased expression of extracellular matrix genes
(including collagen VI implicated in adipose tissue rigidity, and the
wound response gene Sparc; Extended Data Fig. 7b, c) compared to
control tissue. Accompanying these changes, several proteases involved
in extracellular matrix remodelling and angiogenesis (members of
the ADAM, ADAMTS, MMP and CELA families) are differentially
expressed (Extended Data Fig. 7b, d). Of note, adipocytes from aged
fTreg knockout mice are smaller than those in control mice (fTreg
knockout: ~70% <5,000 μm², control: ~41% <5,000 μm²; Fig. 3h and
Extended Data Fig. 4b), and serum non-esterified free fatty acid levels
are reduced to almost half those of control mice; both indicators of
improved insulin sensitivity (Fig. 3i). In addition, circulating levels of
the adipokine resistin, which positively correlates with mouse IR, are
reduced in the aged fTreg knockout mice (Fig. 3j). Furthermore,
aged fTreg knockout mice present with decreased hepatic steatosis, as
determined histologically and by decreased fasting hepatic and serum
triglyceride content (Fig. 3l-n). In combination, these findings suggest
that the loss of fTreg cells in adipose tissue alleviates many of the
indications of age-associated IR in mice, a primary clinical manifestation
of metabolic ageing.

To associate fTreg cells more directly with age-associated IR, we
measured basal glucose uptake in adipose tissue ex vivo. Notably, VAT
from fTreg knockout mice took up almost twice the amount of
glucose as control tissue (Fig. 3k). Conversely, expansion of fTreg cells in
wild-type mice via treatment with IL-2-anti-IL-2 monoclonal antibody
complex reduces basal glucose uptake in VAT by ~50% (Fig. 3o, p).
This inverse correlation between fTreg cell numbers and glucose uptake
in adipose tissue supports a causal association between fTreg cells and
IR during ageing.

Our findings of an association between fTreg cells and age-associated
IR and metabolic ageing suggest that these cells are functionally
distinct from splenic Treg cells. To investigate this notion, we
compared the transcriptomes of Treg cells, as well as conventional CD4⁺
T (Tconv) cells, isolated from VAT and spleen. Comparative analyses
revealed that while certain canonical genes are similarly expressed
(for example, Foxp3, Ctla4 and Tigit), VAT and splenic Treg cells have
discrete expression signatures, consistent with the suggested
functional distinction. In particular, Pparg, Gata3 and Irf4 are selectively
enriched in VAT but not splenic Treg cells (Extended Data Fig. 8a).
Furthermore, unbiased comparative gene expression analyses
combined with hierarchical clustering defined extensive fat- and
splenic-residence clusters (1,142 and 1,431 genes, respectively) relative to
Figure 3 | Loss of fTreg cells protects against the clinical hallmarks of
age-associated IR. a, fTreg levels in VAT from control and fTreg knockout
(Foxp3^Cre Pparg^fl/fl) mice in young (12 weeks, control n = 6; fTreg KO
n = 6), aged (~15 months, control n = 10; fTreg KO n = 10) and obese
(control n = 6; fTreg KO n = 8) cohorts. b, c, Fasting serum glucose (b) and
insulin (c) levels in control and fTreg knockout mice in young (12 weeks,
control n = 9; fTreg KO n = 9), aged (36 weeks, control n = 9; fTreg KO
n = 11) and obese (control n = 10; fTreg KO n = 10) cohorts. d-f, Glucose
(left) and insulin (right) tolerance tests of control and fTreg knockout mice
in young (12 weeks, control n = 8; fTreg KO n = 8) (d), aged (36-37 weeks,
control n = 8; fTreg KO n = 9) (e) and obese (control n = 9; fTreg KO n = 10)
(f) cohorts. g, Representative haematoxylin and eosin (H&E) staining of
VAT (epididymal) from ~14-month-old control (n = 3) and fTreg knockout
mice (n = 5). Scale bars, 50 μm. h, Box and whisker plot of adipocyte size
distribution in VAT from control (n = 3) and fTreg knockout mice (n = 3)


much smaller pan-Treg clusters 1 and 2 (56 and 162 genes,
respectively). Transcriptionally, fTreg cells cluster more closely with fat Tconv
cells than splenic Treg cells (Fig. 4a), suggesting that the functional
specification of fTreg cells is informed by their anatomical location
within adipose tissue, as well as the expression of the Treg
cell-lineage-specifying transcription factor Foxp3 (refs 23, 24 and Fig. 4b).
Importantly, aged fTreg cells maintain their suppressive functionality
as measured by in vitro suppression assays (Fig. 4c, d), and indicated
by the high expression levels of Ctla4 (ref. 25), Il2ra (ref. 26), and the
anti-inflammatory cytokine Il10 (Fig. 4b). We posit that the
transcriptional differences between fTreg cells and splenic Treg cells (found in
the fTreg cell cluster of 1,049 genes) may provide a therapeutic route to
manipulate fTreg cell populations selectively. The IL-33 receptor ST2,
which lies within the fTreg cell cluster, has been recently implicated
in effector Treg and in particular fTreg cell development. Indeed,
ST2 was ~60 and ~30 times more highly expressed in fTreg cells than
in splenic Treg and fat Tconv cells, respectively, consistent with the
ImmGen database (http://www.immgen.org; Fig. 4e and Extended
Data Fig. 8b). Flow cytometry confirmed that ST2 is expressed on the
cell surface of most fTreg cells, but on relatively few fat Tconv cells, or
splenic Treg or Tconv cells (Fig. 4f, g). Furthermore, VAT has ~25 times
more ST2⁺ fTreg cells than ST2⁺ fat Tconv cells; a similarly trending
~10 times difference is observed in the spleen (Fig. 4h).
(~14 months). i, Ad-libitum-fed serum non-esterified fatty acid (NEFA)
levels in ~14-month-old control (n = 9) and fTreg knockout mice (n = 10).
j, Serum resistin levels in ~14-month fasted control and fTreg knockout
mice (n = 4 pooled samples (2 mice per sample) per group). k, Post-prandial
glucose uptake in VAT of aged control (n = 5) and fTreg knockout
mice (n = 4). l, Representative H&E staining of liver from ~14-month-old
control (n = 3) and fTreg knockout mice (n = 5). Scale bars, 200 μm.
m, Hepatic triglyceride levels in ~14-month-old control (n = 5) and fTreg
knockout mice (n = 3). n, Fasting serum triglycerides in ~14-month-old
control (n = 9) and fTreg knockout mice (n = 10). o, fTreg cells, expressed
as percentage of total CD45.2⁺ cells, in control (PBS) and
IL-2-anti-IL-2-treated mice (n = 3 mice per group). p, Relative glucose uptake in VAT of
16-week-old control and IL-2-anti-IL-2-treated mice (n = 4 mice
per group). Data are mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001,
****P < 0.0001, Student's t-test.


To explore the therapeutic potential of the IL-33/ST2 signalling
pathway, aged mice were initially injected with IL-33 (0.5 μg
intraperitoneally (i.p.) on days 0, 2 and 4, analysis on day 6) to expand the
fTreg cell population (Fig. 4i-k). In agreement with fTreg cell expansion
driven by IL-2-anti-IL-2 treatment, mice injected with IL-33 display
signs of IR (basal glucose uptake in VAT reduced to ~60% of
control mice; Fig. 4l). In the converse approach, acute treatment with an
anti-ST2 antibody (200 μg per mouse i.p. on days 0 and 2, analysis
on day 3) is able to significantly deplete fTreg cells (~50% reduction),
with a smaller percentage reduction of splenic Treg cells (Fig. 4m).
Notably, the partial depletion of fTreg cells achieved with acute anti-ST2
treatment coincides with an increase in insulin-stimulated glucose
uptake in VAT (~25% increase in glucose uptake compared to
control-treated mice; Fig. 4n), suggesting a link between fTreg cell depletion
and increased adipose insulin sensitivity. Furthermore, this increase
in insulin sensitivity is achieved without any signs of Tconv cell
activation associated with systemic Treg cell dysfunction (Extended Data
Fig. 8c-e).

As obesity and ageing often associate in humans, we challenged
aged fTreg knockout and control mice with HFD. While the knockout mice were
initially protected against HFD-induced weight gain and associated metabolic
dysregulation, the metabolic benefits attributed to the loss of fTreg cells
were progressively lost over 8 weeks (Extended Data Fig. 9a-e), further
Figure 4 | fTreg cell depletion improves adipose glucose uptake.
a, Hierarchical clustering of differentially expressed genes between fat Treg
and Tconv and splenic Treg and Tconv cells from Foxp3^Thy1.1 mice (47 weeks,
cells pooled from 3-4 mice, same data set used in b, e). b, Fragments
per kilobase of transcripts per million mapped reads (FPKM) values of
selected genes important for Treg cell identity and canonical suppressive
function. c, d, In vitro suppression assay of fTreg cells (pooled from
retired breeders, added at 1:1 ratio with splenic Tconv cells, conducted
in triplicate). c, Representative carboxyfluorescein succinimidyl ester
(CFSE) tracings of Tconv cells with or without fTreg cells. Gating indicates
percentage of dividing cells. d, Expansion index of Tconv cells. e, Fold
change in expression levels of differentially expressed genes across fat
Treg and Tconv and splenic Treg and Tconv cells. Fat Treg cell cluster genes are
labelled in red. Position of St2 (also known as Il1rl1) is marked.


suggesting that distinct pathophysiologies drive obesity- versus age- 
associated IR. 

Taken together, our data provide evidence that distinct
adipo-immune populations orchestrate unique features of age- and
obesity-associated IR. We show in the non-obese setting that
Pparg-positive fTreg cells accumulate to unusually high levels (6.7%) as a
function of age, exacerbating both the decline of adipose metabolic
function as well as the rise in IR (Fig. 4o). These results are in marked
contrast to the increased role of M1 adipose tissue macrophages in
metabolic dysfunction linked to obesity, coupled with a suppression
of fTreg levels to 0.9%. Thus, these studies highlight contrasting roles
of the immune compartment in contributing to key aspects of adipose
health and disease.

Given the classical immune suppressive and anti-inflammatory
nature of Treg cells, we speculate that the chronic inflammatory
processes that drive obesity-associated IR seem unlikely to be driving
age-associated IR. Indeed, there is increasing appreciation that
maintaining a certain degree of inflammation is beneficial for adipose tissue
remodelling and its metabolic function. Failure to preserve an optimal
f, g, Representative FACS analysis (f) and quantification of ST2 expression
(g) in CD4⁺ T cells from aged mice (45 weeks, n = 5 mice). h, Total
number of ST2⁺ Treg and Tconv cells per gram of tissue in VAT and spleen
(n = 5 mice). i-k, FACS histograms (i) and quantification (j) of Treg cells
(percentage Foxp3⁺ of CD45⁺ CD4⁺ population), and cells per gram
of tissue in VAT and spleen after IL-33 or PBS treatment (k) (16 weeks,
n = 5 mice per group). l, Ex vivo glucose uptake in VAT from wild-type
mice after control or IL-33 treatment (16 weeks, n = 5 mice per group).
m, n, Quantification of fTreg and splenic Treg cells (percentage Foxp3⁺ of
CD45 population) (m) and ex vivo insulin-stimulated glucose uptake (n)
in VAT from wild-type mice after anti-ST2 depleting antibody or isotype
control treatment (~45 weeks, n = 4 mice per group). o, Adipo-immune
model of metabolic ageing. Data are mean ± s.e.m. *P < 0.05, **P < 0.01,
***P < 0.001, ****P < 0.0001, &P = 0.053, %P = 0.056, Student's t-test.


immune state in the aged adipose tissue may directly contribute to
metabolic disorders such as IR and age-associated diabetes. We suggest
type IV diabetes as a designation for non-obese-dependent fTreg-driven
metabolic disease of the elderly.

In this context, it is of particular significance that fTreg cells in aged
adipose tissue express the cytokine receptor ST2 at ~30-60-fold higher
levels than in other sites such as spleen, making the fTreg cell
population sensitive to depletion via anti-ST2 treatment. While ST2 has been
implicated in other physiological processes and immune cell types that
may also affect glucose homeostasis, this simple fTreg cell depletion
approach increases adipose tissue insulin sensitivity, suggesting the
potential of selective fTreg cell depletion therapy in the prevention of
age-related IR.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Mice. All mice were housed in the specific pathogen-free facilities at The
Salk Institute for Biological Studies or purchased from Taconic Biosciences.
C57BL/6NTac mice were purchased from Taconic Biosciences for
comparative AIPs or studies that required wild-type aged mice. Age-matched
retired breeders were purchased for AIPs of aged adipose tissue, and
diet-induced obese C57BL/6NTac mice were purchased for profiling of obese
adipose tissue. fTreg knockout mice were generated by crossing Foxp3^Cre
(Foxp3-IRES-YFP-Cre) (ref. 30) and Pparg^fl/fl (ref. 31) mice. We used the Foxp3^Thy1.1
(ref. 32) reporter mice when isolating Treg and Tconv CD4⁺ cells from spleen and
fat for subsequent RNA-Seq analysis. Mice within The Salk Institute for Biological
Studies received autoclaved normal chow (MI laboratory rodent diet 5001, Harlan
Teklad), irradiated HFD (60 kcal% fat, Research Diets), or irradiated HFD with
rosiglitazone (30 mg kg⁻¹ of food, Research Diets). All mice used for studies were
male. All procedures involving animals were performed in accordance with
protocols approved by the Institutional Animal Care and Use Committee (IACUC) and
Animal Resources Department (ARD) of the Salk Institute for Biological Studies.
AIPs. Visceral (epididymal) and subcutaneous (inguinal) adipose depots were
dissected from mice after 10 ml PBS perfusion through the left ventricle. Inguinal
lymph nodes resident in inguinal adipose tissue were removed. Adipose tissue
was minced into fine pieces (2-5 mm³) and digested in stromal vascular isolation
buffer (100 mM HEPES, pH 7.4, 120 mM NaCl, 50 mM KCl, 5 mM glucose, 1 mM
CaCl₂ and 1.5% BSA) containing 1 mg ml⁻¹ collagenase at 37°C with intermittent
shaking for 1.5 h. The suspension was then passed through a 100-μm mesh to
remove undigested clumps and debris. The flow-through was allowed to stand for
10 min to separate the floating adipocyte fraction and infranatant containing the
stromal vascular fraction. The infranatant was removed while minimally
disturbing the floating adipocyte fraction and centrifuged at 400g for 10 min. The pellet
containing the stromal vascular fraction was washed once in 10 ml RPMI. The
resultant isolated cells were subjected to FACS analysis. The following antibodies
were used to assemble the adipo-immune profile. BioLegend: CD45.2 (104), CD44
(IM7), CD62L (MEL-14), TCRγ/δ (GL3), CD19 (6D5), CD25 (PC61), CD206
(C068C2), CD301 (LOM-14); eBioscience: CD3 (145-2C11), CD25 (PC61),
CD4 (RM4-5), TCRβ (H57-597), B220 (RA3-6B2), NK1.1 (PK136), CD49b
(DX5), Foxp3 (FJK-16s), F4/80 (BM8), CD11c (N418), CD11b (M1/70); Tonbo
biosciences: F4/80 (BM8.1), CD4 (RM4-5), CD44 (IM7), CD62L (MEL-14), Ly6G
(RB6-8C5); BD Pharmingen: Siglec-F (E50-2440); BD Biosciences: CD8a (53-6.7).
Cells were analysed using the BD FACSAria instrument (Becton Dickinson) and
FlowJo software (Tree Star).

Body composition and adipocyte size analyses. Body composition was measured
with an Echo MRI-100 body composition analyser (Echo Medical Systems). VAT
(epididymal) was dissected, and the wet weight was determined. Adipose tissues
were fixed in 10% formalin, sectioned, and stained in haematoxylin and eosin.
Adipocyte cross-sectional area was determined from photomicrographs of VAT
using ImageJ.

In vivo metabolic phenotype analysis. Real-time metabolic analyses were
conducted in a Comprehensive Lab Animal Monitoring System (Columbus
Instruments). CO₂ production, O₂ consumption and ambulatory counts were
determined for at least three consecutive days and nights after at least 24 h for
adaptation before data recording.

Principal component analysis of AIPs. Non-macrophage immune cell 
populations, described as percentage of the total CD45.2* immune compartment, 
were inputted into MetaboAnalyst 3.0 (http://www.metaboanalyst.ca/ 
MetaboAnalyst/) for PCA. No normalizations, transformations, or scalings were 
implemented. 
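
The following is a minimal sketch of a PCA on unscaled abundance data, written in plain Python so that it is self-contained. It is not the MetaboAnalyst implementation: the power-iteration solver, the function names and the example percentages are assumptions made for illustration only. Mean-centring is applied because PCA is conventionally defined on centred data; no other normalization, transformation or scaling is performed.

```python
import math


def center(rows):
    """Subtract each column's mean; no scaling or transformation is applied."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (features x features) of mean-centred rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigen(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    p = len(matrix)
    start = [float(i + 1) for i in range(p)]  # asymmetric starting vector
    norm = math.sqrt(sum(x * x for x in start))
    vec = [x / norm for x in start]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # matrix has no remaining variance along this vector
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(p))
                for i in range(p))
    return value, vec


def pca(rows, n_components=2):
    """Return (scores, components, variances) using deflation."""
    centered = center(rows)
    cov = covariance(centered)
    p = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        value, vec = leading_eigen(cov)
        components.append(vec)
        variances.append(value)
        # Deflate: remove the component just found from the covariance matrix.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, variances


# Hypothetical profiles: rows are mice, columns are non-macrophage cell
# populations expressed as % of CD45.2+ cells (values invented for illustration).
profiles = [
    [5.1, 4.2, 12.0],
    [4.8, 4.6, 11.5],
    [0.4, 4.9, 9.8],
    [0.5, 4.5, 10.1],
]
scores, components, variances = pca(profiles)
```

Because the inputs are left unscaled, populations with the largest absolute variance dominate the leading components, which is the practical consequence of omitting scaling.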

Glucose homeostasis studies. Fasting was induced for 6 h, except for glucose
tolerance tests, which were conducted after overnight fasting. Glucose (1-2 g kg⁻¹,
i.p.) and insulin (0.5-1.0 U kg⁻¹, i.p.) were injected for glucose tolerance tests and
insulin tolerance tests, respectively. Blood glucose was monitored using a Nova
Max Plus glucometer.

Histological analyses. Sections (4 μm) of fixed tissues were stained with
haematoxylin and eosin according to standard procedures. Histopathological 
scores were graded on blinded samples for severity and extent of inflammation 
and morphological changes by a pathologist. 

Serum analyses. Blood was collected by tail bleeding or right atrial puncture. 
Non-esterified fatty acids (Wako) and triglycerides (Thermo) were measured 
using colorimetric methods. Serum insulin levels (Ultra Sensitive Insulin, Crystal 
Chem) were measured by ELISAs. Serum cytokine and metabolic hormone levels 
were analysed by the Luminex Bio-Plex system using the Mouse Cytokine 23-Plex 
Panel and Diabetes Panel, respectively, according to the manufacturer’s instructions 
(Bio-Rad). 

Core body temperature. Mice were single housed, and core body temperature was 
measured with a clinical rectal thermometer (Thermalert model TH-5; Physitemp) 


during the middle of the light cycle. The probe was dipped in a room temperature 
lubricating glycerol before insertion. 

Ex vivo 2-DG uptake assays. Adipose tissue was dissected from mouse, cut into
small pieces with scissors, washed and incubated for 30 min with Krebs-Ringer
Bicarbonate HEPES buffer (KRBH, 120 mM NaCl, 4 mM KH₂PO₄, 1 mM MgSO₄,
0.75 mM CaCl₂, 30 mM HEPES and 10 mM NaHCO₃, pH 7.4, supplemented with
1% fatty-acid-free BSA). For determination of exogenous insulin-stimulated
2-deoxy-D-glucose (2-DG) uptake, adipose was incubated in KRBH with
100-200 nM insulin for 20 min at 37°C. Cold 2-DG and hot 2-DG-1,2-³H(N) were added
to incubated adipose tissue such that the final concentration of cold 2-DG was
0.1 mM and final quantity of hot 2-DG-1,2-³H(N) was 0.1 μCi (assuming total
reaction volume ~400 μl). Adipose was further incubated for 20 min at 37°C,
then washed three times with PBS before being lysed by scintillation fluid. 2-DG
uptake was determined by measuring scintillation counts normalized to adipose
tissue mass used for assay. Nonspecific 2-DG uptake levels were determined by
co-treating adipose tissue with cytochalasin B (0.1 μM final concentration) with
the addition of cold and hot 2-DG.
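
The normalization described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' analysis script: the function names are hypothetical, and subtracting the cytochalasin B (nonspecific) signal from the total signal is an assumed use of the nonspecific measurement, since the text does not state how that value entered the calculation.

```python
def normalized_uptake(counts, tissue_mass_mg):
    """Scintillation counts normalized to the adipose tissue mass used (counts per mg)."""
    if tissue_mass_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return counts / tissue_mass_mg


def specific_uptake(total_counts, total_mass_mg, nonspecific_counts, nonspecific_mass_mg):
    """Mass-normalized uptake minus the cytochalasin B background (assumed correction)."""
    total = normalized_uptake(total_counts, total_mass_mg)
    background = normalized_uptake(nonspecific_counts, nonspecific_mass_mg)
    return total - background


def relative_uptake(sample, control):
    """Uptake of a sample expressed relative to the control group mean."""
    if control == 0:
        raise ValueError("control uptake must be non-zero")
    return sample / control
```

Expressing each sample relative to the control mean, as in `relative_uptake`, is one way to arrive at statements such as "reduced to ~60% of control"; the counts themselves would come from the scintillation counter.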

IL-2-anti-IL-2 complex and IL-33 injections. IL-2-anti-IL-2 complexes were
prepared by incubating 2 μg of mouse IL-2 (Biolegend) with 10 μg of anti-IL-2
antibody (JES6.1, Bioxcell) in a total volume of 200 μl PBS for 30 min at 37°C
(amounts given per injection). Mice were injected i.p. three times (days 0, 1 and
2) and analysed on day 8. For IL-33 expansion assays, mice were injected i.p. with
0.5 μg of recombinant mouse IL-33 in PBS (R&D systems) three times (days 0, 2
and 4) and analysed on day 6. PBS was used for control injections.

RNA-Seq library generation. Total RNA was isolated from sorted cells or whole 
tissues using TRIzol reagent (Invitrogen) as per the manufacturer's instructions 
and treated with DNase I (Qiagen) for 30 min at 22°C. Sequencing libraries were
prepared from 10-100 ng of total RNA using the TruSeq RNA sample prepara- 
tion kit v2 (Illumina) according to the manufacturer’s protocol. In brief, mRNA 
was purified, fragmented and used for first- and second-strand cDNA synthesis 
followed by adenylation of 3’ ends. Samples were ligated to unique adaptors 
and subjected to PCR amplification. Libraries were then validated using the 
2100 BioAnalyzer (Agilent), normalized and pooled for sequencing. RNA-Seq 
libraries prepared from two biological replicates for each experimental condition 
were sequenced on the Illumina HiSeq 2500 using barcoded multiplexing and a 
100-bp read length. 

High-throughput sequencing and analysis. Image analysis and base calling 
were done with Illumina CASAVA-1.8.2. This yielded a median of 29.9 M usable 
reads per sample. Short read sequences were mapped to a UCSC mm9 reference 
sequence using the RNA-Seq aligner STAR (ref. 33). Known splice junctions from mm9
were supplied to the aligner and de novo junction discovery was also permitted. 
Differential gene expression analysis, statistical testing and annotation were per- 
formed using Cuffdiff 2 (ref. 34). Transcript expression was calculated as gene-level 
relative abundance in fragments per kilobase of exon model per million mapped 
fragments and employed correction for transcript abundance bias (ref. 35). RNA-Seq
results for genes of interest were also explored visually using the UCSC Genome 
Browser. 
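
For orientation, the unit used above (fragments per kilobase of exon model per million mapped fragments, FPKM) can be written as a short calculation. The Python sketch below shows only the arithmetic behind the unit; it is not how Cuffdiff 2 estimates abundance, which relies on a statistical model with bias correction. The function name and the example numbers are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    kilobases = gene_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical gene: 500 fragments on a 2,000-bp exon model in a library of
# 25 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 25_000_000)
```

Dividing by both length and library size is what makes values comparable across genes of different sizes and across samples sequenced to different depths.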

Hierarchical clustering. Differentially expressed gene names and corresponding
FPKM values across samples were inputted into GENE-E (Broad Institute) for
hierarchical clustering analysis (implemented one minus Pearson correlation
for sample and gene distance metrics and the average linkage method) and
visualization. Gene cluster names were created to describe the gene expression
characteristics within each cluster (that is, fat-residence cluster refers to the gene
cluster in which genes are expressed at greater levels in T cells residing in fat;
fat-Treg cluster refers to the gene cluster in which genes are expressed highest in
only the fTreg cells).
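
A minimal sketch of the clustering configuration described above (one minus Pearson correlation as the distance, average linkage) is given below in plain Python. It is not GENE-E's implementation; the function names and the four toy expression profiles are hypothetical and serve only to illustrate the metric and linkage rule.

```python
import math


def pearson_distance(a, b):
    """One minus the Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("constant profile: correlation is undefined")
    return 1.0 - cov / math.sqrt(var_a * var_b)


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    names = list(profiles)
    dist = {frozenset((p, q)): pearson_distance(profiles[p], profiles[q])
            for i, p in enumerate(names) for q in names[i + 1:]}
    clusters = [(name,) for name in names]
    merges = []

    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise distance between members of two clusters.
        total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = [(cluster_distance(c1, c2), c1, c2)
                 for i, c1 in enumerate(clusters) for c2 in clusters[i + 1:]]
        d, c1, c2 = min(pairs, key=lambda item: item[0])
        merges.append((c1, c2, d))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical expression values for four genes in four samples
# (invented for illustration; not data from the study).
example = {
    "fat_Treg": [9.0, 8.0, 1.0, 1.0],
    "fat_Tconv": [8.0, 9.0, 2.0, 1.0],
    "spleen_Treg": [1.0, 2.0, 9.0, 8.0],
    "spleen_Tconv": [2.0, 1.0, 8.0, 9.0],
}
merges = average_linkage(example)
```

With a correlation-based distance, samples are grouped by the shape of their expression profiles rather than by absolute expression level, which is why tissue of residence can dominate the clustering when it drives a large shared gene programme.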

In vitro suppression assay. fTreg cells were isolated from aged wild-type mice
treated with IL-2-anti-IL-2 complexes to expand fTreg cell numbers, as described
above with isolation conducted on day 6. Stromal vascular fractions were isolated
from VAT as described above, and fTreg cells were sorted from stromal vascular
fractions using the BD FACSAria instrument (Becton Dickinson), gating on
CD45.2⁺ CD4⁺ CD25⁺ cells. CD45.1⁺ mice were used to isolate splenic responder
T cells, which were purified by positive selection using CD4-specific Dynabeads
(Invitrogen), followed by sorting on a BD FACSAria cell sorter, gating on
CD45.1⁺ CD4⁺ CD25⁻ CD62L^hi CD44^lo cells. Antigen-presenting cells were
prepared from wild-type B6 splenocytes by T-cell depletion using
Thy1-specific MACS beads. CFSE-labelled effector T cells (5 × 10⁴ cells well⁻¹) were
co-cultured with fTreg cells at the indicated ratio in the presence of irradiated
(30 Gy) antigen-presenting cells (1 × 10⁵ cells well⁻¹) in 96-well plates in
complete RPMI 1640 medium supplemented with 10% FBS and CD3 antibody
(1 μg ml⁻¹). Cell proliferation and expansion index were determined 96 h
later using the BD FACSAria instrument and analysed with the FlowJo software
package (Tree Star).
ST2 studies and anti-ST2-depleting antibody treatment. Antibody for ST2 FACS 
analysis was purchased from MD Bioproducts, clone DJ8. For fTreg cell depletion, 
mice were injected i.p. with 200 µg anti-ST2 antibodies** (R&D Systems, clone 
245707) or isotype control (Bioxcell) twice (days 0 and 2) and euthanized for 
analysis on day 3. 

Statistical analyses. Statistical analyses were performed with Prism 6.0 
(GraphPad). P values were calculated using two-tailed unpaired or paired 
Student’s t-test. When analysing AIPs, we used a false discovery rate approach 
to avoid the problem of an inflated false positive rate due to the substantial num- 
ber of hypothesis tests. Mice cohort size was designed to be sufficient to enable 
statistical significance to be accurately determined. When applicable, mice were 
randomly assigned to treatment or control groups. No animals were excluded 
from the statistical analysis, with the exception of exclusions due to technical 
errors, and the investigators were not blinded in the studies. Appropriate statis- 
tical analyses were applied, assuming a normal sample distribution, as specified 
in the figure legends. No estimate of variance was made between each group. All 
in vivo metabolic and glucose homeostasis experiments, ex vivo glucose uptake 
experiments, and AIP experiments were conducted with at least two independent 
cohorts. RNA-Seq experiments, Luminex profiling and histological analyses were 
conducted using multiple biological samples (as indicated in figure legends) from 
indicated cohorts. 
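
The text above does not say which false discovery rate procedure was applied to the AIP comparisons. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up adjustment, a common choice for this situation; treating it as the method used here is an assumption, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value of rank r (1 = smallest) among m tests, the raw
    adjustment is p * m / r; a running minimum taken from the largest
    rank downwards keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        candidate = p_values[index] * m / rank
        running_min = min(running_min, candidate)
        adjusted[index] = running_min
    return adjusted
```

A hypothesis is then called significant when its adjusted value falls below the chosen FDR threshold, rather than comparing each raw P value with 0.05.
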
study premise. b–d, AIPs were generated through the use of several macrophage subsets (c) and eosinophils and neutrophils (d). 

distinct antibody cocktails. Here, using Foxp3“* (Foxp3-IRES-YFP-Cre) 

further divided into pan-macrophage (middle) and non-macrophage 
Extended Data Figure 3 | Aged fTreg knockout mice do not show signs 
of systemic autoimmunity or breakdown in peripheral tolerance. 
a, Percentage of splenic naive CD4+ T cells as defined by CD62Lhi CD44lo 
relative to total CD4+ CD25− Foxp3− population (n = 9 mice per group). 
b, Representative histology of gastrointestinal tract: duodenum, jejunum, 
ileum and colon (left to right) (n = 3 mice per group). There were no 
significant lesions observed or differences in inflammation, epithelial 
changes, or mucosal architecture between the two groups (H&E, original 
magnification, ×100). Scale bar, 50 µm. c, Histopathology score in the 
small intestine and colon of fTreg knockout and control mice. The severity 
and extent of inflammation and epithelial changes as well as mucosal 
architecture were each graded on a score of 1 (minimal) to 
5 (severe) and added to obtain an overall score over 20. There were 
minimal inflammatory changes with no significant differences between 
groups. d, Multiplex inflammation panel of serum from fTreg knockout 
and control mice (n = 4 pooled samples (3 mice per sample) per group). 
Data are mean ± s.e.m. *P < 0.05, ***P < 0.001, Student’s t-test. 
Extended Data Figure 4 | Weight-matched cohorts’ body weights and 
adipocyte size frequency in VAT of aged control and fTreg knockout 
mice. a, Body weights of fTreg knockout and control male mice used in 
weight-matched metabolic studies in young (12 week; control n = 9; fTreg 
KO n = 9), aged (36 week; control, n = 9 mice; fTreg KO, n = 11 mice) and 
obese (diet-induced obese, 12 weeks of HFD starting at 12 weeks; control 
medium (5,000–10,000 µm²) and large (>10,000 µm²) adipocytes in 
VAT of aged control and fTreg knockout mice (n = 3 mice per group, 850 
adipocytes counted from control mice, 269 adipocytes counted from fTreg 
knockout adipose). Data are mean ± s.e.m. 
Extended Data Figure 5 | VAT AIPs of obese fTreg knockout and control 
mice. a, AIPs of diet-induced obese (16 weeks high fat diet started at 
12 weeks) control (n = 6 mice) and fTreg knockout (n = 8 mice) male mice 
depicting immune cell abundance, expressed as percentage of CD45.2+ 
Extended Data Figure 6 | fTreg cells are dispensable for TZDs to 
exert their therapeutic insulin-sensitizing effect. a, Scheme used for 
longitudinal interventional study of control and fTreg knockout mice, which 
indicates when particular assays were conducted and whose results are 
described in b–g, in which rosiglitazone (Rosi) was introduced in diet after 
firmly establishing obesity with a HFD alone for 12 weeks (n = 8 mice per 
group). b, Cohort weights during course of study. Black arrow indicates 
introduction of rosiglitazone to the diet. c, Homeostatic model assessment 
of IR (HOMA-IR). d, e, Glucose tolerance test (d) and glucose excursions 
of glucose tolerance test (e) described as area under curve (AUC). f, g, 
Insulin tolerance test (f) and bar-graph quantitation of relative serum 
glucose decrease during insulin tolerance test (g) described as area above 
curve (AAC). h, Scheme used for parallel prophylactic study of control 
and fTreg knockout mice, the results of which are described in i–l, in which 
mice were placed on a HFD or HFD with rosiglitazone for 12 weeks (n = 8 
mice per group). i, Cohort weights at end of study. j, HOMA-IR. k, l, 
Glucose and insulin tolerance tests of control (k) or fTreg knockout (l) 
mice fed HFD or HFD with rosiglitazone. m, Scheme used to determine 
temporal relationship of TZD-induced fTreg expansion and TZD-induced 
insulin-sensitization in wild-type mice, the results of which are described 
in n–q, where mice were fed HFD or HFD with rosiglitazone for up to 
11 weeks (n = 10 mice per group, 5 mice of each group were euthanized 
at 5 weeks after diet introduction and remaining 5 mice were euthanized 
at 11 weeks). n, HOMA-IR at 4 weeks. o, p, Glucose (o) and insulin (p) 
tolerance tests at 5 weeks. q, Relative fTreg cell enrichment of mice fed 
HFD with rosiglitazone versus mice fed HFD alone at 5 and at 11 weeks. 
Data are mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, Student’s t-test. 
Extended Data Figure 7 | Increased TNFα levels and gene expression 
pattern of aged fTreg knockout adipose tissue is consistent with an 
improved adipose remodelling capacity. a, TNFα levels quantified by 
ELISA of whole adipose lysate (~40 weeks, n = 6 per group). b–d, FPKM 
in VAT from aged fTreg knockout and control mice (~40 weeks, n = 3 mice 
per group). Data are mean ± s.e.m. ***P < 0.001, Student’s t-test. 
Extended Data Table 1 | Antibodies used to identify the given immune cell type molecularly 

Immune Cell Type            Molecular Identification Scheme 
TCRγδ                       CD45.2+ F4/80− CD3+ TCRβ− TCRγ+ 
CD8+                        CD45.2+ F4/80− CD3+ TCRβ+ CD4− CD8+ 
Treg CD4+                   CD45.2+ CD4+ CD25+ Foxp3+ 
Naive CD4+                  CD45.2+ CD4+ CD25− Foxp3− CD62Lhi CD44lo 
Activated CD4+              CD45.2+ CD4+ CD25− Foxp3− CD62Llo CD44hi 
NKT                         CD45.2+ NK1.1+ TCRβ+ 
NK                          CD45.2+ NK1.1+ TCRβ− 
B                           CD45.2+ NK1.1− CD19+ 
Eosinophil                  CD45.2+ F4/80+ Siglec-F+ 
Neutrophil                  CD45.2+ F4/80− CD11c− CD11b+ Ly6G+ 
M2 ATM                      CD45.2+ F4/80+ CD11c™* CD206+ 
M1 ATM                      CD45.2+ F4/80+ CD11c+ CD206− 
DN (Double-negative) ATM    CD45.2+ F4/80+ CD11c− CD206− 
Genome-wide detection of DNase I hypersensitive 
sites in single cells and FFPE tissue samples 


Wenfei Jin, Qingsong Tang, Mimi Wan, Kairong Cui, Yi Zhang, Gang Ren, Bing Ni, Jeffrey Sklar, Teresa M. Przytycka, 
Richard Childs, David Levens & Keji Zhao 


DNase I hypersensitive sites (DHSs) provide important information 
on the presence of transcriptional regulatory elements and the state 
of chromatin in mammalian cells'*. Conventional DNase sequencing 
(DNase-seq) for genome-wide DHSs profiling is limited by the 
requirement of millions of cells*>. Here we report an ultrasensitive 
strategy, called single-cell DNase sequencing (scDNase-seq) for 
detection of genome-wide DHSs in single cells. We show that DHS 
patterns at the single-cell level are highly reproducible among 
individual cells. Among different single cells, highly expressed gene 
promoters and enhancers associated with multiple active histone 
modifications display constitutive DHS whereas chromatin regions 
with fewer histone modifications exhibit high variation of DHS. 
Furthermore, the single-cell DHSs predict enhancers that regulate cell- 
specific gene expression programs and the cell-to-cell variations of DHS 
are predictive of gene expression. Finally, we apply scDNase-seq to pools 
of tumour cells and pools of normal cells, dissected from formalin-fixed 
paraffin-embedded tissue slides from patients with thyroid cancer, and 
detect thousands of tumour-specific DHSs. Many of these DHSs are 
associated with promoters and enhancers critically involved in cancer 
development. Analysis of the DHS sequences uncovers one mutation 
(chr18: 52417839G>C) in the tumour cells of a patient with follicular 
thyroid carcinoma, which affects the binding of the tumour suppressor 
protein p53 and correlates with decreased expression of its target gene 
TXNL1. In conclusion, scDNase-seq can reliably detect DHSs in single 
cells, greatly extending the range of applications of DHS analysis 
both for basic and for translational research, and may provide critical 
information for personalized medicine. 

We developed a circular carrier DNA-mediated sequencing method, 
called scDNase-seq, to analyse genome-wide DHSs in a few cells or 
even single cells (Fig. 1a). Application of scDNase-seq to NIH3T3 cells 


Figure 1 | Genome-wide detection of DHSs 
DNA and sequencing. b, Genome Browser 
displays showing the DHS in ENCODE data 
and small cell number scDNase-seq data (black 
tracks). The red tracks show scDNase-seq read 
densities in DHSs of 5 single NIH3T3 cells and 
14 single mouse ESCs. c-f, Scatter plots showing 
the tag density correlation of DHSs between two 
libraries. Each dot represents one or more DHSs. 
g-i, Venn diagrams showing the significant 
overlaps of DHSs between two libraries. 
Figure 2 | Detectability of single-cell DHSs is positively correlated 
with gene expression and number of active histone modifications. 

a, Number of tags within ±1 kilobase (kb) of TSSs correlated with higher 
gene expression in single cell number 1. b, scDNase-seq tag density in 
single cell 1 is positively correlated with gene expression in a population 
of cells. c, The proportion of open promoters detected by scDNase-seq in 
single cell 1 is positively correlated with gene expression. d, Housekeeping 
genes (red) show higher tag density and lower variation than tissue- 
specific genes (green). e, Genes with open promoter in more single cells 
are associated with higher expression levels. f, The percentage of overlaps 
between DHSs detected in 1,000 NIH3T3 cells and single cell 1 positively 
correlated with gene expression. The total number of genes with DHSs for 


generated DHS profiles of 10,000, 1,000, 100 and even single cells com- 
parable to that of mouse ENCODE data obtained from 10 million to 
20 million cells (Fig. 1b). On average, about 317,000 unique 
scDNase-seq reads and 38,000 DHSs were detected per single cell. 
Although the numbers of mapped reads and DHSs decrease as the cell 
numbers decrease, the enrichments of reads in DHSs in different single 
cells were very similar (23-26% of reads in DHS regions), despite minor 
differences (Supplementary Tables 1-3). Scatter plot analysis indicated 
that the DHSs from 10,000, 1,000 and 100 cells are as reproducible 
as the ENCODE data (Fig. 1c, d and Extended Data Fig. 1a–c). The 
pooled DHSs of five single NIH3T3 cells were significantly correlated 
with those of 1,000 cells (Fig. 1e). We also observed high correlation of 
DHSs between single cells (Fig. 1f and Extended Data Fig. 1d–l). Venn 
diagrams showed that at least 90% of DHSs in single cells could be 
detected in the 1,000-cells data (Fig. 1g and Extended Data Fig. 2a-d). 
Large fractions (41-82%) of DHSs were shared between two single cells 
(Fig. 1h and Extended Data Fig. 2e–m). Although only 35–59% of the 
DHSs in the 1,000-cells data were detected in each single cell (Fig. 1g 
and Extended Data Fig. 2a–d), detectability increased to 72% when the 
five single cells were pooled (Fig. 1i), suggesting single-cell-specific 
DHSs contribute to the total number of DHSs detected in a population 
of cells. 
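
The overlap percentages in this paragraph are directional: the share of single-cell DHSs found in the 1,000-cell data is not the same number as the share of 1,000-cell DHSs found in a single cell. The sketch below shows one way such a fraction could be computed. The overlap rule (any shared base pair, half-open coordinates) and all names are assumptions made for illustration, since the main text does not define them.

```python
def overlap_fraction(query, reference):
    """Fraction of query intervals overlapping at least one reference interval.

    Intervals are (chromosome, start, end) tuples with half-open
    coordinates; two intervals overlap if they share at least one base.
    """
    if not query:
        return 0.0

    by_chrom = {}
    for chrom, start, end in reference:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in query:
        candidates = by_chrom.get(chrom, [])
        if any(start < r_end and r_start < end for r_start, r_end in candidates):
            hits += 1
    return hits / len(query)
```

Calling `overlap_fraction(single_cell, pooled)` and `overlap_fraction(pooled, single_cell)` gives the two directions separately. The linear scan per query is fine for a sketch; sorted intervals or an interval tree would be the usual choice for tens of thousands of sites.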

The false discovery rates (FDRs) of single-cell libraries were 11-13% 
(Supplementary Table 2) when one scDNase-seq tag was detected 
within a DHS region, suggesting that even detection of one tag is likely 


each group is indicated outside the Venn diagrams. The numbers in red 
indicate the percentages of DHSs detected in single cell 1 that overlapped 
with the DHSs detected in 1,000 cells. g, Active histone modifications 
(H3K4me1, H3K4me3, H3K9ac, H3K27ac and H2A.Z) are associated 
with higher scDNase-seq tag density than the repressive H3K27me3 and 
H3K9me2 modifications in single cells. h, The H3K27ac level effectively 
predicts the detectability by scDNase-seq. i, The scDNase-seq density in 
cell 1 correlated with the number of active histone modifications. j, The 
detection of DHSs across multiple single cells is positively correlated with 
the number of histone modifications. k, The DHS detectability in single 
cells is correlated with the tag density of DHS peaks in the ENCODE data. 


to represent a true DHS. Indeed, transcription start sites (TSSs) with 
one tag exhibited significantly higher expression levels than those with- 
out any tag (Fig. 2a and Extended Data Fig. 3a–d). The tag number 
at TSSs positively correlated with expression levels when the number 
was low (zero to three tags), but expression levels did not significantly 
change when the number was high (more than three tags) (Fig. 2a and 
Extended Data Fig. 3a–d), indicating that the gene expression was no 
longer limited by accessibility once the promoter had become accessible. 
As expected, the tag density at TSSs in each single cell correlated with 
gene expression levels measured in a population of cells (Fig. 2b and 
Extended Data Fig. 3e-h), and almost all promoters of highly expressed 
genes were accessible in each single cell (Fig. 2c and Extended Data 
Fig. 3i-l). Consistent with these observations, the tag densities at 
housekeeping genes were higher and variations lower than those at 
tissue-specific genes (Fig. 2d). The number of cells where a promoter 
exhibited DHS correlated with its gene expression: the genes with DHSs 
across all five single cells had the highest expression level (Fig. 2e). 
Further analysis showed that the genes with the lowest cell-to-cell var- 
iation at promoters were significantly enriched in basic cell functions 
such as transcription, cell cycle and RNA processing (Supplementary 
Table 4). The genes with the highest cell-to-cell variation were signifi- 
cantly enriched in metal ion binding (Supplementary Table 5). 
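
The promoter analyses in this paragraph rest on counting tags near each TSS. A minimal sketch of that counting step follows, assuming a symmetric 1-kb window on either side of the TSS and a single representative position per tag; the data layout and function name are hypothetical.

```python
from bisect import bisect_left, bisect_right


def tags_near_tss(tag_positions, tss_positions, window=1000):
    """Count tags within `window` bp on either side of each TSS.

    tag_positions: {chromosome: [tag position, ...]}
    tss_positions: {gene: (chromosome, TSS position)}
    Returns {gene: tag count}; genes on chromosomes without tags get 0.
    """
    sorted_tags = {chrom: sorted(pos) for chrom, pos in tag_positions.items()}
    counts = {}
    for gene, (chrom, tss) in tss_positions.items():
        positions = sorted_tags.get(chrom, [])
        # Binary search for the inclusive range [tss - window, tss + window].
        low = bisect_left(positions, tss - window)
        high = bisect_right(positions, tss + window)
        counts[gene] = high - low
    return counts
```

Genes can then be binned by tag count (zero, one, two, three, more than three) and their expression levels compared across bins.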

Next we examined the fraction of overlapping open promoters where 
DHS was detected in either 1,000 cells or one single cell in different 
expression groups. The analysis revealed that although only 58-61% of 
Figure 3 | Single-cell scDNase-seq DHS data can predict cell-specific 
enhancers. a, Genes with DHSs in fewer single cells (x axis) exhibit much 
higher variation of gene expression across different single cells (y axis). 

b, Genes with DHSs in fewer single cells (y axis) are expressed in fewer single 
cells (x axis). c, The NIH3T3- and ESC-specific DHSs identified in single 
cells showed expected cell specificity in all libraries. d, The subpeaks of 
NIH3T3-specific super-enhancers show much higher tag density in NIH3T3 
cells than that in ESCs. e, The subpeaks of ESC-specific super-enhancers 
show much higher tag density in ESCs than that in NIH3T3 cells. 


the open promoters overlapped in the silent gene group, 98-99.9% of 
the open promoters in intermediately and highly expressed gene groups 
detected in a single cell overlapped with those detected in 1,000 cells 
(Fig. 2f and Extended Data Fig. 4), indicating that the DHSs of active 
genes can be consistently detected in single cells. 

Compared with promoter/proximal DHSs, distal DHSs showed 
lower tag density, higher cell-to-cell variation and noise (Extended 
Data Fig. 5a–d). Nevertheless, distal DHSs in single cells were clearly 
enriched in active histone modifications (H3K4me1, H3K4me3, 
H3K9ac, H3K27ac and H2A.Z) but not repressive ones (H3K36me3, 
H3K9me2 and H3K27me3) (Fig. 2g), which is consistent with the 
scenario at the population level®!! and validated our single-cell assay. 
Interestingly, DHS detectability in single cells correlated with the degree 
of enrichment of the active histone modification (Fig. 2h and Extended 
Data Fig. 5e-h), and correlated with the number of active marks at 
the DHSs (Fig. 2i and Extended Data Fig. 5i-l). The vast majority of 
DHSs were detected across all five single cells when five active histone 
modification marks were present, whereas DHSs were exhibited in a variable 
number of cells when only one or two active marks were present 
(Fig. 2j). These results indicate that DHSs at enhancers are variable 
between different cells and provide strong evidence that multiple active 
histone modifications are strongly correlated with chromatin accessibility 
across different single cells. 

We compared DHS detectability in single cells with the tag 
density of DHSs from 1,000 cells or 20 million cells. The results 
indicated that the detectability of DHSs in single cells positively corre- 
lated with the tag density in libraries prepared from a large number of cells 
(Fig. 2k and Extended Data Fig. 5m). We hypothesized that strong 
DHSs would be present in all the cells and weak DHSs in only a 
fraction. If this were the case, more strong DHSs and fewer weak 
DHSs should be detected within one single cell. Indeed, 80-90% of 
the strong DHSs were detected whereas only 20-30% of weak DHSs 
were detected in single cells (Fig. 2k and Extended Data Fig. 5m). 
Another prediction from this hypothesis was that relatively fewer 
strong DHSs and more weak DHSs would be additionally detected as 
we added up single cells. Pooling the five single cells indeed showed 
the fraction of detected weak DHSs was doubled, whereas the fraction 
of detected strong DHSs only increased by a small percentage (Fig. 2k 
and Extended Data Fig. 5m). 

The variation of DHSs among single cells within a ‘homogenous’ 
population is reminiscent of the well-known phenomenon of variation 
of gene expression among single cells!*. To study their relationship, we 
constructed 14 single embryonic stem-cell (ESC) scDNase-seq librar- 
ies (Supplementary Tables 6-8). Comparison with single-cell RNA 
sequencing (RNA-seq) data! revealed that tag density and variation at 
TSS of single-cell scDNase-seq indeed correlated with that of single-cell 
gene expression (Extended Data Fig. 6a, b). Furthermore, the genes 
with DHSs in fewer single cells showed high variation of expression and 
were expressed in fewer single cells (Fig. 3a, b). These results further 
indicate that the cell-to-cell variations of single-cell DHSs are predictive 
of gene expression. Consistent with this notion, we found a signifi- 
cantly higher correlation between the technical repeats compared with 
that of two non-technical repeat libraries (Extended Data Fig. 6c, d). 
The Gene Ontology (GO) terms enriched among genes with the lowest 
and the highest cell-to-cell variation in the 14 ESCs were consistent 
with that in single 3T3 cells (Supplementary Tables 9 and 10). 

We next identified 1,735 NIH3T3-specific DHSs and 2,180 
ESC-specific DHSs using the 5 NIH3T3 and 14 ESC single-cell 
scDNase-seq libraries. Heat map showed these cell-specific DHSs 
displayed expected cell specificity in all of the libraries (Fig. 3c). The 
cell-specific DHSs were highly correlated with cell-specific gene 
expression (Extended Data Fig. 6e, f) and enriched in distinct bio- 
logical functions (Extended Data Fig. 6g, h). Super-enhancers play 
a key role in regulating expression of critical cell-specific genes'*"*. 
We identified 275 NIH3T3-specific and 231 ESC-specific super- 
enhancers and compared their single-cell scDNase-seq tag densities. 
The subpeaks of 3T3-specific super-enhancers were associated with a 
substantially higher tag density in single 3T3 cells than that in ESCs, 
and vice versa (Fig. 3d, e), indicating that single-cell DHSs can help 
predict super-enhancers. 

Chromatin defects underlie various diseases including cancers’”. 
Profiling genome-wide chromatin accessibility in cells from patients, 
which are often limiting in numbers, would be clinically invaluable. 
We applied scDNase-seq to cells dissected from a follicular thyroid 
carcinoma (FTC) sample on formalin-fixed paraffin-embedded (FFPE) 
slides (Fig. 4a). DNase I digestion resulted in typical periodic cleav- 
age patterns of nucleosome arrays and read enrichment around TSSs 
(Fig. 4b and Extended Data Fig. 7a—c). Likewise, the genome browser 
displays showed peaks (Fig. 4c), suggesting the cells recovered from the 
FFPE slides retained key chromatin features. 

HMGA2 is upregulated in FTC'®" and its promoter indeed exhibited 
higher accessibility in the tumour than that in adjacent normal cells 
(Fig. 4d). Overall, 1,342 tumour-specific and 2,812 normal-specific 
DHSs were identified (Extended Data Fig. 8a, b). The genes associ- 
ated with the tumour-specific DHSs were significantly enriched in 
the GO biological process terms such as regulation of GTPase acti- 
vity and response to hypoxia, and pathways such as E-cadherin sig- 
nalling, RhoA signalling, p53 pathway, RAC1 signalling and MYC 
transformation (Extended Data Fig. 8). Among these were several 
known genes such as TIAM1 and PIP4K2A (Extended Data Fig. 9a, b), 
involved in tumours”*”!. Interestingly, genes that are characteristic of 
PAX8-PPARG fusion” in FTC were enriched in tumour-specific DHSs 
(Extended Data Fig. 8f and Supplementary Table 11), even though 
Figure 4 | Application of scDNase-seq to tissue sections from patients 
with FFPE reveals novel pathophysiological information on thyroid 
cancers. a, Full view of a slide of FTC 440 stained with haematoxylin and 
eosin. Cells recovered from the highlighted areas were subject to scDNase- 
seq analysis. b, Typical periodic DNase cleavage patterns of nucleosomes 
were detected both for normal and tumour cells by scDNase-seq. c, Genome 
Browser image displaying the scDNase-seq profiles of the normal (blue) 
and tumour (red) cells from two thyroid carcinomas 440 and 131. 

d, Genome Browser image showing the increased chromatin accessibility 
of the HMGA2 promoter in the tumour cells of FTC 440 (left). Quantitative 
reverse transcription PCR (qRT-PCR) shows the increased HMGA2 
messenger RNA (mRNA) level in the tumour cells (right). e, An SNV was 


PPARG gene rearrangement was not detected by fluorescence in situ 
hybridization of FTC 440 (data not shown). This suggests that pathways 
associated with the transcriptional regulation by PAX8-PPARG, but 
not necessarily the PAX8-PPARG rearrangement itself, are important 
in mediating follicular thyroid tumorigenesis. 

We similarly analysed samples from two more FTC (797 and 957) 
and one papillary thyroid carcinoma (PTC from patient number 131) 
samples (Supplementary Table 12). Comparison of the tumour-specific 
DHSs identified in the three FTC samples revealed very few shared 
DHSs among all three FTC samples (Extended Data Fig. 10a). The 
HMGA2 promoter exhibited a strong DHS in the tumour cells but not 
in their neighbouring normal cells in FTC from patient number 440, 
whereas in the other two FTC cases (957 and 797) the promoter showed 
strong DHSs both in tumour and in normal cells (Extended Data 
Fig. 10b). Instead, an intronic enhancer showed differential DHSs 
between the tumour and normal cells (Extended Data Fig. 10b). These 
results suggest that the mis-regulation of HMGA2 in the tumour cells 
may be attributed to different regulatory elements in different patients. 
Analysis of PTC 131 also identified numerous tumour-cell- and normal- 
cell-specific DHSs, which are enriched in disease ontologies (Extended 
Data Fig. 10c). Overall, our results indicate that the vast majority of 
DHSs are patient-specific, implying that these tumours may arise or 
progress via different mechanisms in different patients. 

To gain further mechanistic insight, we searched for genetic lesions 
within DHSs in FTC 440 by comparing the DHS sequence between 


identified at a DHS near the 3’ end of the TXNL1 gene in the tumour cells 
of FTC 440. The SNV location is indicated by the red square. The SNV was 
confirmed by Sanger sequencing (highlighted region). f, The G to C change 
in tumour cells negatively impacts the p53 target motif. The SNV in the 
p53 motif logo is indicated by a red arrowhead. g, The G to C change in the 
tumour cells is correlated with decreased expression of TXNL1 by qRT-PCR 
analysis. h, Protein p53 is bound to the SNV region in a human thyroid cell 
line by chromatin immunoprecipitation (ChIP)–qPCR analysis. i, The 
G to C change decreases p53 binding affinity in vitro by gel shift assay. 
j, The G to C change reduces the activity of the p53 motif to activate a 
reporter promoter in vivo. The p53 motif from the p21 promoter was used 
as a positive control. 


tumour and normal cells. A total of 31 potential single nucleotide var- 
iations (SNVs) were identified in the DHS regions, which included 
both loss of heterozygosity of known SNPs and de novo mutations 
(Supplementary Table 13). We confirmed the de novo mutation 
(chr18:52417839G>C) at a DHS downstream of the thioredoxin-like 1 
gene (TXNL1) (Fig. 4e). TXNL1 encodes a regulatory subunit of the 
human 26S proteasome”. Downregulation of TXNL1 is associated with 
poor prognostic outcomes, aneuploidy in colorectal carcinoma~* and 
is implicated in cisplatin-induced apoptosis”. Interestingly, the G to C 
change appears to negatively impact the binding motif of p53 (Fig. 4f) 
and correlates with significantly decreased expression of TXNL1 in the 
tumour cells (Fig. 4g); p53 binds to this DHS in a human thyroid cell 
line (Fig. 4h). The G to C mutation at this site compromises p53 binding 
(Fig. 4i) and impairs its ability to activate a reporter promoter (Fig. 4j), 
suggesting that the G to C change may underlie the decreased TXNL1 
expression in the tumour cells (Fig. 4g). This SNP was not detected in 
the other three patients (797, 957 and 131). Therefore, our strategy for 
searching SNVs in relevant DHS regions seems to be a cost-effective 
alternative to whole-genome sequencing for detecting functionally 
important mutations in regulatory regions. 

Tn5-transposase-mediated detection of chromatin accessibility 
(scATAC-seq)”°”” in a large number of single cells has been reported 
recently. However, the reads per cell generated by scATAC-seq may be 
too sparse to examine the cell-to-cell variation at individual regulatory 
regions**”’, In comparison, our scDNase-seq detects a much larger 


3 DECEMBER 2015 | VOL 528 | NATURE | 145 


© 2015 Macmillan Publishers Limited. All rights reserved 




number of DHSs per cell, which provides information on cell-to-cell 
variations of individual DHSs. scDNase-seq is expected to find its use 
in multiple settings, such as the analysis of rare cell populations during 
lineage development and the study of clinical samples with extremely 
small numbers of cells such as circulating tumour cells, laser-captured 
cells, core biopsy or fine-needle aspiration samples. Being able to eval- 
uate the chromatin states associated with specific diseases or develop- 
mental programs might provide valuable information for developing 
new diagnostic and therapeutic strategies for these malignancies. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Cell culture and sorting. NIH/3T3 tet-on 3G cells (Clontech, 631197) were cul- 
tured in DMEM (Invitrogen, 10566-016) supplemented with 10% FBS (Sigma, 
F4135-500ML) and 100 U ml⁻¹ penicillin-streptomycin (Invitrogen, 15140-122). 
Mouse ESCs were cultured as described**. Single-cell suspension after trypsini- 
zation was used for 4’,6-diamidino-2-phenylindole (DAPI) staining immediately 
before sorting by flow cytometry. Single live cells were sorted and deposited directly 
into each tube of a PCR strip-tube, which contained 30 µl cell lysis buffer (10 mM 
Tris-HCl, pH 7.5, 10 mM NaCl, 3 mM MgCl₂, 0.1% Triton X-100). 

DNase I digestion and scDNase-seq library preparation. To prevent loss of the 
extremely small amount of DNase I hypersensitive DNA (<0.1 pg) released by 
DNase I digestion of single cells, we added a large amount of circular plasmid 
DNA (30 ng; about 3 × 10⁵ times the amount of the DHS DNA in a single cell) 
as carrier DNA in the subsequent steps of library preparation. The circular DNA 
was not compatible with the adaptor ligation and thus could minimize the non- 
specific amplification by the subsequent PCR. The PCR conditions were optimized 
to amplify the small fragments (<200 base pairs (bp)) derived from DNase I hyper- 
sensitive sites without previous fractionation of these fragments. 
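
The stated carrier excess follows directly from the two masses given above, as this short worked check shows. Because the hypersensitive DNA is given only as an upper bound (<0.1 pg), the computed ratio is a lower bound on the true excess. The helper name is hypothetical.

```python
PG_PER_NG = 1_000  # picograms in one nanogram


def carrier_excess(carrier_ng, dhs_dna_pg):
    """Mass ratio of carrier DNA to DNase I hypersensitive DNA."""
    if dhs_dna_pg <= 0:
        raise ValueError("dhs_dna_pg must be positive")
    return carrier_ng * PG_PER_NG / dhs_dna_pg


# 30 ng of carrier against at most 0.1 pg of hypersensitive DNA per cell.
ratio = carrier_excess(30, 0.1)
```

With these inputs the ratio is 300,000, that is 3 × 10⁵, which matches the figure quoted in the paragraph.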

For DNase I digestion, 0.2 to 1 unit of DNase I (Roche, 04716728001) was
added to the cells and incubated at 37 °C for 5 min. The reaction was stopped by
adding 80 µl of stop buffer (10 mM Tris-HCl, pH 7.5, 10 mM NaCl, 0.15% SDS,
10 mM EDTA) containing 1 µl of 20 mg ml⁻¹ proteinase K and 5 µl of 6 ng µl⁻¹
circular carrier DNA. The mixture was incubated at 55 °C for 1 h and the DNA was
purified by phenol-chloroform extraction, followed by precipitation with ethanol in
the presence of 20 µg glycogen. The library was prepared using Illumina kits as
described”’. The libraries were amplified using a two-step method to preferentially 
amplify the small DNA fragments derived from the DNA hypersensitive sites and 
to reduce non-specific amplification of the carrier DNA. The first amplification 
was done with index primers under the PCR conditions 98 °C for 10 s, 67 °C for 30 s,
72 °C for 30 s for six cycles. After isolation of the desired fragments (160-300 bp)
using 2% E-gel (Invitrogen), the second amplification was done with the P5 and P7
primers under the conditions 98 °C for 10 s, 68 °C for 30 s, 72 °C for 30 s for 22 cycles.
The fragments between 160 and 300 bp were isolated on E-gel and sequenced on 
Illumina HiSeq 2500. 

Recovery of cells from FFPE tissue slides. The anonymized tumour samples 
from Ambry Genetics, approved by institutional review board and with informed 
consent, were used in this study. Three cases of thyroid cancer were diagnosed 
as FTC and one case was diagnosed as papillary thyroid carcinoma. Cells were 
manually scraped off from the highlighted area of a paraffin slide using a razor 
blade and resuspended in 150 µl of de-paraffinization solution (Qiagen, 1064343)
and incubated at 56 °C for 3 min. After cooling to room temperature (about 25 °C),
150 µl of lysis buffer (10 mM Tris-HCl, pH 7.5, 10 mM NaCl, 3 mM MgCl₂, 0.1%
Triton X-100) was added and incubated at 37°C for 2h. The cells in the lower layer 
were transferred to a new tube and digested by DNase I as described above. The 
formaldehyde cross-linking was reversed by incubating DNA at 65°C overnight, 
which was followed by DNA purification and library preparation. 

Extraction of total RNAs from cells recovered from FFPE slides, RT-PCR, 
RNA-seq. Cells recovered from FFPE slides were resuspended in 150 µl of
de-paraffinization solution (Qiagen, 1064343) and incubated at 56 °C for 3 min. Total
RNA was extracted using an RNA extraction kit (Qiagen, 73504), following
the manufacturer’s instructions. After reverse transcription using an
oligonucleotide dT primer, the mRNA expression levels of selected genes were analysed
using the following gene-specific primers and probes from Applied Biosystems:
HMGA2-Hs00171569_m1, TIAM1-Hs01021959_m1, TXNL1-Hs00355488_m1,
PIP4K2A-Hs00178197_m1 and GAPDH-Hs99999905_m1.

The RNA-seq libraries were generated according to established protocols and 
sequenced on HiSeq 2500 platforms. 

Validation of SNVs by Sanger sequencing. The tumour and adjacent normal 
cells from FFPE slides were recovered and resuspended in 100 µl of 1× TE +
0.1% SDS + 0.2 mg ml⁻¹ proteinase K. Following incubation at 65 °C overnight,
the genomic DNA was purified using phenol-chloroform extraction and ethanol 
precipitation. The genomic region containing the potential sequence variation was 
amplified by PCR using specific primers. The PCR products were then sequenced 
by Sanger sequencing. Forward primer, AAGCTAAATGAGCAAAATATTCCT; 
reverse primer, GGGAGGCTGAGGCAGTAGAATCG. 

ChIP, electrophoretic mobility shift assay and promoter reporter assays. 
Chromatin extracts were prepared from a human thyroid cell line (Nthy-ori 3-1 
human Cell Line, from Sigma-Aldrich, 90011609). ChIP experiments were per- 
formed with p53 antibodies (Santa Cruz Biotechnology, sc-6243X) using established 
protocols’. The ChIP DNA was analysed using qPCR with the following primers:
p53 positive forward primer, G(CATGCGATCTTGGCTCACT; reverse primer, 
CTTGGGAGGCTGAGGCAGTA; probe, CAACCTCCGCCTCCCGGGTTC. 
Control forward primer, CCCCATGCTGTTCTCGTGATA; reverse primer,
GCAAAGGTGAATCAAGGCATCT; probe, TTTATAAGGTTCTCTTCCCCTTTCGCTGGG.

Electrophoretic mobility shift assay (EMSA) experiments were performed 
using nuclear extracts of HeLa cells transfected with a p53 expression vector 
(provided by J. Huang). Briefly, the double-stranded oligonucleotide probes 
(wild-type p53 site, CACTCTGTTGCCCGGGCTAGTGTGCAGT; tumour 
p53 site, CACTCTGTTGCCCGGGCTACTGTGCAGT; p21 promoter p53 site, 
CAGGAACAAGTCAAGACATGTTCAGC) were synthesized and labelled with 
biotin using the Biotin 3′ End DNA Labelling Kit (Thermo Scientific, 89818). The
EMSA assays were conducted by using LightShift Chemiluminescent EMSA Kit 
(Thermo Scientific, 20148) according to the manufacturer's instructions. 
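
The wild-type and tumour p53-site probes listed above differ at a single base. A small sketch that locates the substitution by comparing the two probe sequences position by position (the helper name is illustrative; positions are 0-based within the probe, not genomic coordinates):

```python
# Sketch: find the base(s) that differ between two equal-length oligonucleotides.
def mismatches(ref: str, alt: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, reference base, alternative base) per mismatch."""
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length")
    return [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]


# EMSA probe sequences as given in the text.
WILD_TYPE = "CACTCTGTTGCCCGGGCTAGTGTGCAGT"
TUMOUR = "CACTCTGTTGCCCGGGCTACTGTGCAGT"

diffs = mismatches(WILD_TYPE, TUMOUR)
for pos, ref_base, alt_base in diffs:
    print(f"position {pos}: {ref_base} -> {alt_base}")
```

The single reported difference is a G to C change, matching the mutation tested in the reporter constructs.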

To test the ability of the p53 binding sites to activate a reporter promoter,
we cloned the wild-type p53 binding motif, the motif with the G to C mutation
and the p53 motif from the p21 promoter into the XhoI and BglII sites upstream of
the basal cytomegalovirus promoter driving a luciferase reporter gene (provided
by J. Huang). The constructs were transfected into Nthy-ori 3-1 cells
for 2 days and the luciferase activity of whole-cell extracts was
measured using a Dual-Luciferase Reporter Assay kit (Promega, E1960).
The oligonucleotide sequences used in the reporter constructs were as follows:
wild-type p53 site, TCGAGCTGTTGCCCGGGCTAGTGTGA; tumour
p53 site, TCGAGCTGTTGCCCGGGCTACTGTGA; p21 promoter p53 site,
TCGAGGAACAAGTCAAGACATGTTCA.

Data analysis. Data, reads mapping and filtering: in this study, we constructed a 
total of 38 scDNase-seq libraries including 8 NIH3T3 libraries (Supplementary 
Table 1), 18 ESC libraries (Supplementary Table 6) and 12 FFPE patient libraries 
(Supplementary Table 12). Among these libraries, there are 5 NIH3T3 single-cell 
scDNase-seq libraries and 14 ESC single-cell scDNase-seq libraries. We also pre- 
pared eight RNA-seq libraries using cells recovered from the FFPE tissue section 
slides of FTC 440 (Supplementary Table 12). In addition to the scDNase-seq and 
RNA-seq libraries prepared in this study, we integrated the histone modification 
ChIP-seq data of NIH3T3 from our previous study*”. We also downloaded the 
DNase-seq data of NIH3T3 cells and ESCs from mouse ENCODE project*!. Reads 
of DNase-seq/scDNase-seq/ChIP-seq were mapped to the mouse genome (mm9) 
or human genome (hg18) using Bowtie2 (ref. 32). Iterative alignment, in which the
unmapped reads were trimmed by 5 bp and re-aligned until the reads were shorter than
26 bp, was conducted for small cell number scDNase-seq libraries and single-cell
scDNase-seq libraries. The reads with mapping quality (MAPQ) < 10 or redundant
reads that mapped to the same location with the same orientation were removed 
from further analysis in each library. The mappability of 1,000-cells scDNase-seq 
libraries to the mouse or human genome was about 40% whereas that of the 
single-cell scDNase-seq libraries was about 2% owing to non-specific amplification 
of carrier DNA. The tag density at each bin of 200 bp was calculated by normalizing 
the number of reads in the bin to the total number of reads in the library and the 
bedgraphs were uploaded to the UCSC Genome Browser. 
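
The trimming schedule and the per-bin tag density described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the aligner itself is not modelled, and the per-million scaling factor is an assumption (the text says only that bin counts were normalized to the total number of reads in the library).

```python
from collections import Counter


def trim_lengths(read_len: int, step: int = 5, min_len: int = 26) -> list[int]:
    """Read lengths tried in order during iterative alignment: the full length,
    then `step` bp shorter each round, stopping before dropping below `min_len`."""
    return list(range(read_len, min_len - 1, -step))


def bin_tag_density(positions, bin_size: int = 200, scale: float = 1_000_000):
    """Map filtered reads, given as (chrom, start) pairs, to fixed-size bins.

    Returns {(chrom, bin_start): reads in bin / total reads * scale}.
    `scale` (reads per million) is an assumed convention, not stated in the text.
    """
    positions = list(positions)
    total = len(positions)
    if total == 0:
        return {}
    counts = Counter((chrom, (start // bin_size) * bin_size) for chrom, start in positions)
    return {key: n / total * scale for key, n in counts.items()}


# A 36-bp read would be tried at 36, 31 and 26 bp.
schedule = trim_lengths(36)

# Toy library of four reads.
density = bin_tag_density([("chr1", 10), ("chr1", 150), ("chr1", 250), ("chr2", 5)])
```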

Peak calling for DNase-seq/scDNase-seq and correlation between different 
libraries: the DHSs in mouse ENCODE DNase-seq data and small cell num- 
ber scDNase-seq data were identified using model-based analysis of ChIP-seq 
(MACS)* by setting a P value to 1 x 10~. The peaks identified in the ENCODE 
data were extended ±1 kb from the summit of the peak if the peak size was <2 kb
and overlapping peaks were merged. Then the number of reads in each DHS for 
all DNase-seq and scDNase-seq libraries was counted. The tag density at each
DHS was calculated by normalizing the number of reads in the DHS to the total
number of reads in the library (the probability of a tag falling on a given base pair,
per million reads). The Pearson product-moment correlation coefficient (r) of tag densities at
genome-wide DHS between two libraries was calculated to indicate the correlation 
between different scDNase-seq libraries. For single-cell libraries, reads outside
the defined DHS regions were filtered out and the number of reads in each 1,000-bp
bin was counted to generate the single-cell heat map (Fig. 1b). Any DHS region
containing at least one read in a single cell was treated as accessible and thus as a
DHS in that single cell. For the pooled five single cells, any DHS region containing
at least two reads was treated as a DHS in the pooled five single cells.
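
The two operations in this paragraph, correlating tag densities between libraries and calling a DHS from read counts, can be sketched in plain Python. Function names are illustrative and the toy inputs are made up:

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length vectors,
    e.g. tag densities of two libraries over the same set of DHSs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def call_dhs(read_counts: list[int], min_reads: int = 1) -> list[bool]:
    """A region counts as a DHS if it holds at least `min_reads` reads:
    1 for a single cell, 2 for the pooled five single cells (per the text)."""
    return [count >= min_reads for count in read_counts]


r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
single_cell_calls = call_dhs([0, 1, 3])
pooled_calls = call_dhs([0, 1, 3], min_reads=2)
```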

The FDR of the DHS detected in single cells: in an NIH3T3 single-cell
scDNase-seq library, the total number of observed DHSs and the number of false positive
(type I error) DHSs were denoted by N_DHS and N_FP, respectively. Any
reads located outside the DHSs detected in ENCODE data must have been
caused by noise generated during library preparation. The noise level (σ) should
be the total number of reads located outside the DHSs in ENCODE data divided
by the total length of the regions that are not DHSs. The number of false positive DHSs
should be the genome-wide noise level (σ) multiplied by the total length of the
DHSs. Thus, the FDR should be the number of false positive DHSs divided by the
number of all detected DHSs in the single cell:

$$\mathrm{FDR} = \frac{N_{\mathrm{FP}}}{N_{\mathrm{DHS}}} = \frac{\sigma \times \mathrm{length}_{\mathrm{DHS}}}{N_{\mathrm{DHS}}}$$


On the basis of this formula, we calculated the FDR for each NIH3T3 and ESC 
single-cell scDNase-seq library (Supplementary Tables 2 and 7). 
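
The FDR estimate described in this section is a short calculation: a background read rate outside known DHSs, scaled to the total DHS length, divided by the number of DHSs detected. A minimal sketch under that reading (the function and argument names are illustrative, and the example numbers are invented, not taken from the Supplementary Tables):

```python
def single_cell_fdr(
    reads_outside_dhs: int,
    non_dhs_length_bp: int,
    dhs_length_bp: int,
    n_detected_dhs: int,
) -> float:
    """Estimate the FDR of DHSs detected in one single-cell library.

    noise level      = reads outside reference DHSs / length of non-DHS regions
    false positives  = noise level * total length of the DHSs
    FDR              = false positives / detected DHSs
    """
    noise_per_bp = reads_outside_dhs / non_dhs_length_bp
    expected_false_positives = noise_per_bp * dhs_length_bp
    return expected_false_positives / n_detected_dhs


# Hypothetical library: 20,000 background reads over 2.5 Gb of non-DHS sequence,
# 100 Mb of DHS sequence, 40,000 DHSs detected.
fdr = single_cell_fdr(20_000, 2_500_000_000, 100_000_000, 40_000)
print(f"estimated FDR: {fdr:.3f}")
```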

Differentially expressed genes and tissue-specific genes: the reads from RNA- 
seq libraries were mapped to the mouse genome (mm9) or human genome (hg18) 
using Bowtie2 (ref. 32). The gene expression level was measured by reads per 
kilobase per million mapped reads (RPKM) and number of reads in each gene. 
The cell-specific genes between ESCs and NIH3T3 cells were identified using edgeR
(FDR < 0.05; fold change > 1.5 or < 2/3)**.

We used the tissue specificity index τ (ref. 35) to measure the tissue specificity
of each gene, which is defined as the heterogeneity of its expression level across all
the tissues. Assuming there are n tissues, the expression level of a gene in the jth
tissue is E(j) and the highest expression level of the gene across all tissues is E_max.
Thus τ is calculated by

$$\tau = \frac{\sum_{j=1}^{n} \left(1 - E(j)/E_{\max}\right)}{n - 1}$$

The values of τ range from 0 to 1, with higher values indicating higher variation
of expression across tissues and thus higher tissue specificity, whereas lower values
indicate lower variation of expression across tissues. The genes with the lowest τ
could be considered as housekeeping genes. In this study, we calculated τ on the
basis of gene atlas data from bioGPS. The 2,000 genes with the highest τ and the
2,000 genes with the lowest τ were treated as the tissue-specific genes and
housekeeping genes, respectively.
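
A minimal sketch of the tissue specificity index, assuming it takes the commonly used form in which 1 − E(j)/E_max is summed over the n tissues and divided by n − 1. That form is consistent with the definitions of n, E(j) and E_max and with the stated 0 to 1 range; the function name is illustrative.

```python
def tissue_specificity_tau(expression: list[float]) -> float:
    """Tissue specificity index for one gene.

    `expression` holds the gene's non-negative expression level in each tissue.
    Returns 0 for uniform expression and 1 for expression in a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    e_max = max(expression)
    if e_max <= 0:
        raise ValueError("gene must be expressed in at least one tissue")
    return sum(1 - e / e_max for e in expression) / (n - 1)


housekeeping_like = tissue_specificity_tau([5.0, 5.0, 5.0, 5.0])
tissue_specific_like = tissue_specificity_tau([0.0, 0.0, 0.0, 8.0])
intermediate = tissue_specificity_tau([10.0, 5.0, 0.0])
```

Ranking genes by this value and taking the two extremes reproduces the kind of selection described above (highest τ as tissue-specific, lowest τ as housekeeping).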

The histone modification ChIP-seq data and peak calling: since the peaks of 
some histone marks such as H3K36me3 and H3K27me3 are very broad, we iden- 
tified the tag-enriched peaks using SICER**, which takes advantage of the enrich- 
ment information from neighbouring bins to identify spatial clusters of signals that 
are unlikely to appear by chance. We set the window size to 200 bp and FDR = 0.01
for each histone modification ChIP-seq library, and set the gap to 200 bp for
H3K4me3 and H3K9ac; 400 bp for H2A.Z; and 600 bp for H3K4me1, H3K9me2,
H3K27ac and H3K27me3. We calculated the tag densities of each active histone
modification peak and identified whether the peak was a DHS in each single cell 
to find whether the enrichment of an active histone mark was correlated with the 
number of cells with DHS at the same locus. We calculated the tag densities of 
each single-cell scDNase-seq library at each DHS and examined whether a DHS 
co-occurred with these active histone modifications to find whether the chromatin 
accessibility in each single cell was correlated with the number of histone
modifications at the same locus. Two peaks from different libraries were considered to
co-occur if the overlapping region accounted for >10% of the length of a peak.
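
The co-occurrence rule can be sketched as an interval test. The text does not say which peak's length the 10% refers to; this sketch assumes the rule is satisfied if the overlap exceeds 10% of either peak, and it assumes both peaks lie on the same chromosome and use half-open coordinates. The function name is illustrative.

```python
Interval = tuple[int, int]  # (start, end), half-open, same chromosome assumed


def co_occur(peak_a: Interval, peak_b: Interval, min_fraction: float = 0.10) -> bool:
    """True if the overlap covers more than `min_fraction` of either peak."""
    (a_start, a_end), (b_start, b_end) = peak_a, peak_b
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return False
    return any(
        overlap / (end - start) > min_fraction
        for start, end in (peak_a, peak_b)
    )


# 50-bp overlap: 5% of the first peak but 20% of the second, so it counts.
hit = co_occur((0, 1000), (950, 1200))
# 10-bp overlap: about 1% of each peak, so it does not count.
miss = co_occur((0, 1000), (990, 2000))
```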

Reads around promoters and subpeaks of super-enhancers: the RefSeq genes 
(mm9 and hg18) were downloaded from the UCSC Genome Browser database. 
The regions ±1 kb around the TSS were treated as promoters in this study. The
number of scDNase-seq reads located in a promoter was used to measure the chro- 
matin accessibility of the promoter. We searched the super-enhancer in NIH3T3 via 
ROSE? on the basis of H3K27ac ChIP-seq and scDNase-seq data, respectively. We 
obtained a total of 275 high-confidence super-enhancers in NIH3T3 by identifying 
super-enhancers shown both in H3K27ac and in DNase-seq data. In addition, the 
231 super-enhancers in ESCs reported in ref. 15 were used in this study. Subpeaks 
in super-enhancers were identified by MACS* and average read densities around 
these subpeaks of super-enhancers were calculated. 

Single-cell-specific DHSs and gene set enrichment analysis: the number of
reads located in each DHS detected in ENCODE data in each NIH3T3 cell and ESC
was counted. To examine whether the chromatin accessibility between NIH3T3 
cells and ESCs was significantly different, a Wilcoxon signed-rank test was per- 
formed on the number of reads in the 5 NIH3T3 cells and 14 ESCs at each DHS. 
A DHS was considered active (indicated by 1) in a single cell if one or more
reads were located in the DHS region in that cell, and otherwise not active
(indicated by 0). Fisher’s exact test on each locus was performed on the number
of cells with active DHSs and the number of cells without active DHSs between 
the 5 NIH3T3 cells and 14 ESCs. The DHSs with P< 0.05 both by Wilcoxon test 
and by Fisher’s test were treated as cell-type specific. Finally, we identified 1,735 
single-cell NIH3T3-specific DHSs and 2,180 single-cell ESC-specific DHSs. We 
used gene set enrichment analysis*” to determine whether the genes in the vicinity 
of the single-cell-specific DHSs showed statistically significant differences between 
NIH3T3 cells and ESCs on the basis of the gene expression data. 
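
The per-locus Fisher's exact test on active versus inactive cells can be illustrated with a small two-sided implementation built on the hypergeometric distribution. This is a sketch, not the authors' code; the table layout (rows are cell types, columns are active and inactive cells) and the function name are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Here: a, b = NIH3T3 cells with / without an active DHS at the locus;
          c, d = ESCs with / without an active DHS at the locus.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x active cells in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# A DHS active in all 5 NIH3T3 cells and in none of the 14 ESCs.
p_specific = fisher_exact_two_sided(5, 0, 0, 14)
# A DHS active in 2 of 5 NIH3T3 cells and 6 of 14 ESCs.
p_shared = fisher_exact_two_sided(2, 3, 6, 8)
```

With only 5 and 14 cells, the smallest attainable P value is 1/C(19, 5), which is why the text combines this test with a test on read counts.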

Gene ontology of single-cell NIH3T3-specific and ESC-specific DHSs: to pre- 
dict the function of single-cell NIH3T3-specific or ESC-specific DHSs, we per- 
formed gene ontology analysis using GREAT** with the 1,735 NIH3T3-specific 
and 2,180 ESC-specific DHSs. It is clear that the single-cell ESC-specific DHSs are 
enriched with stem cell development and differentiation genes, and the single-cell 
NIH3T3-specific DHSs are enriched with genes with different functions (Extended 
Data Fig. 6g, h). These results indicate that the ESC-specific and NIH3T3-specific 
DHSs identified in the single-cell scDNase-seq libraries predict important enhanc- 
ers critical for tissue-specific gene expression. 

Identifying tumour-specific mutation: we generated scDNase-seq libraries using
tumour cells or their neighbouring normal cells recovered from FFPE tissue section slides.
The sequence reads were mapped by Bowtie2 (ref. 32) to the human reference
genome (hg18). The paired reads with distance < 500 bp were kept if paired-end
sequencing was performed. Then reads with MAPQ < 20 and possible duplication 
were removed by SAMtools™. Variation calling on each normal-tumour pair was 
conducted using SAMtools mpileup, with diploid model, MAPQ > 20 and base 
alignment quality (BAQ) > 30. Only variations for which the normal and tumour samples
showed different genotypes were kept. Then the low-quality variations were filtered out (query
quality (QUAL) < 20, mapping quality (MQ) < 20, phred probability of all samples
being the same (FQ) < 0, variant distance bias (VDB) < 0.01 and minor allele < 3).
We obtained 31 variation candidates in FTC 440 (Supplementary Table 13), many
of which are located in predicted transcription-factor binding motifs.
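
The variant filtering thresholds above can be expressed as a single predicate. This is a sketch over plain dictionaries rather than a real VCF parser: the field names are hypothetical, and "minor allele < 3" is interpreted here as fewer than three reads supporting the minor allele, which is an assumption.

```python
def keep_variant(variant: dict) -> bool:
    """Apply the genotype and quality filters described in the text.

    A variant is kept only if normal and tumour genotypes differ and none of
    the low-quality conditions (QUAL < 20, MQ < 20, FQ < 0, VDB < 0.01,
    minor-allele reads < 3) holds.
    """
    if variant["normal_genotype"] == variant["tumour_genotype"]:
        return False
    return (
        variant["QUAL"] >= 20
        and variant["MQ"] >= 20
        and variant["FQ"] >= 0
        and variant["VDB"] >= 0.01
        and variant["minor_allele_reads"] >= 3  # assumed meaning of "minor allele < 3"
    )


# Invented example records.
candidate = {
    "normal_genotype": "G/G", "tumour_genotype": "G/C",
    "QUAL": 45.0, "MQ": 37, "FQ": 12.0, "VDB": 0.08, "minor_allele_reads": 5,
}
low_support = dict(candidate, minor_allele_reads=2)
same_genotype = dict(candidate, tumour_genotype="G/G")

kept = [v for v in (candidate, low_support, same_genotype) if keep_variant(v)]
```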

Tumour- and normal-cell-specific DHS: the genome-wide DHSs were obtained 
by peak calling of the normal cell and tumour cell scDNase-seq libraries, respec- 
tively. The DHSs in normal cells and tumour cells were pooled, and reads in 
each library among the pooled DHSs were counted. The normal- and tumour- 
cell-specific DHSs were identified using EdgeR. 
Extended Data Figure 1 | Scatter plots showing the high correlation
of DHSs detected in a small number of cells or single cells. Each dot
represents the tag density of one DHS or more DHSs with the same value.
a-c, Correlation of DHSs between ENCODE data and small cell number
libraries. d-l, Correlation of DHSs between the five single cells.
[Venn diagram panels: a-d, 1K cells versus Cell #2, Cell #3, Cell #4 and Cell #5; e-m, pairs of single cells. Total DHSs per library: Cell #1, 40,855; Cell #2, 40,918; Cell #3, 27,249; Cell #4, 23,641; Cell #5, 37,555.]

Extended Data Figure 2 | Venn diagrams showing DHSs detected in
single cells significantly overlap with those detected in 1,000 cells or the
other single cells. The total number of DHSs in each library is indicated
outside the Venn diagrams. a-d, Overlapping DHSs between single-cell
and 1,000-cells data. e-m, Overlapping DHSs between two single cells.
Extended Data Figure 3 | Detectability of single-cell DHSs at promoters
is positively correlated with gene expression. a-d, Detection of DHSs
around TSSs in each single cell is correlated with higher gene expression.
Genes were sorted according to the number of scDNase-seq reads within
a ±1 kb region of TSSs and plotted against their expression on the y axis.
e-h, scDNase-seq tag density around TSSs in each single cell is positively
correlated with gene expression. Genes were sorted into four groups
according to their expression levels. Box plots show scDNase-seq tag
density around TSSs (y axis). i-l, The proportion of open promoters
detected by scDNase-seq in each single cell is positively correlated with
gene expression.
Extended Data Figure 4 | The DHSs on the promoter of highly
expressed genes are more reproducible. The percentage of overlap
between DHSs detected in 1,000 cells and each single cell positively
correlates with gene expression. The total number of silent genes, lowly
expressed, intermediately expressed and highly expressed genes showing
DHSs are indicated outside the Venn diagrams. The numbers in red
indicate the percentages of DHSs detected in a single cell that overlapped
with the DHSs detected in 1,000 cells.
Extended Data Figure 5 | The high variations of distal DHSs and
detectability of single-cell DHSs correlate with the number of active
histone modifications. a-d, Distal DHSs showing lower tag density
and higher variation than that of proximal DHSs among single cells.
The average tag densities of the proximal DHSs among single cells are
higher than those of distal DHSs (a). The proximal DHSs showed much
lower variation (b) and noise (c) than those of distal DHSs. The fraction
of proximal DHSs highly correlates with the number of cells with the
DHSs (d). e-h, The histone modification levels (H3K4me3, H3K4me1,
H3K9ac and H2A.Z) correlate with the detectability of DHSs in single
cells. Histone modification peaks were sorted according to the number
of single cells in which they were detected by scDNase-seq. The active
histone modification enrichment levels of each group are displayed using
box plots. i-l, scDNase-seq density in each NIH3T3 single cell positively
correlates with the number of active histone modifications at the DHS.
Tag density of each scDNase-seq was sorted according to the number of
histone modifications measured on a population of cells using ChIP-seq.
The scDNase-seq tag densities for each group are shown by box plots.
m, The DHS detectability in single cells is correlated with the tag density
of DHS peaks in the 1,000-cells library. The DHSs obtained from the
1,000-cells library were binned to 100 groups on the basis of the tag
density (or peak height) (x axis). The y axis indicates the fraction of
DHSs detected in single-cell or pooled five single-cell scDNase-seq
libraries for each bin.
Extended Data Figure 6 | Biological variations contributed to
cell-to-cell variation and DHSs detected in single-cell scDNase-seq
can predict cell-type-specific enhancers. a, b, Single-cell chromatin
accessibilities and single-cell gene expressions are positively correlated.
Average (a) and variation (b) of tag density in single-cell scDNase-seq
at gene promoters correlates with that of gene expression level in single
cells, respectively. c, d, Biological variations contribute to cell-to-cell
variations because the correlation coefficients between technical repeats
are significantly higher than those of other pairs of libraries (non-technical
repeat pairs). Two NIH3T3 cells were sorted into one tube, which were
digested with DNase I and then split to two tubes. Thus each tube contained
the amount of DNA that should have been similar to that of one cell.
By doing this, the two libraries prepared using the two tubes could be
treated as technical repeats. c, Correlation coefficients between technical
repeats are higher than those of other pairs of libraries. d, Scatter plot of
a pair of technical repeats. e, f, Genes associated with NIH3T3- and
ESC-specific DHSs correlate with NIH3T3- and ESC-specific expressed
genes, respectively. g, h, Genes associated with NIH3T3- or ESC-specific
DHSs are enriched in distinctive GO terms.
Extended Data Figure 7 | The scDNase-seq libraries for FFPE tissue
slides showing expected patterns. a, DNA fragments of the scDNase-seq
libraries both from cultured cells and from FFPE tissues show periodical
cut patterns expected from DNase I digestion of nucleosomal DNA.
b, The scDNase-seq reads both from cultured cells and from FFPE tissues
are enriched around TSSs. c, The read enrichments around TSSs of FTC
440 normal and FTC 440 tumour are similar.
[Figure panel labels (gene set descriptions): genes whose expression peaked at 60 min and at 120 min after stimulation of HeLa cells with EGF (Gene ID = 1950); genes up-regulated in MCF7 cells (breast cancer) after stimulation with EGF (Gene ID = 1950) or with NRG1 (Gene ID = 3084); genes within amplicon 16q24 identified in a copy number alterations study of 191 breast tumor samples.]

Extended Data Figure 8 | Tumour-specific DHS in FTC-440-enriched 
GO terms and pathways. a, b, Normal- and tumour-specific DHSs 
account for a small fraction of the total DHSs. c, GO biological process 
terms are significantly enriched in the tumour-specific DHS. d, Pathways 
are significantly enriched in the tumour-specific DHS. e, Pathways are
significantly enriched in the tumour-specific DHS with relaxed threshold. 
f, Gene sets that represent gene expression signatures of genetic and chemical 
perturbations are significantly enriched in the tumour-specific DHS. 
H3K4me3 peaks are shown at the bottom of the panel. The PIP4K2A mRNA __ determined by qRT-PCR and normalized to GAPDH (right).
are usually unique. a, The vast majority of DHSs are unique to each
individual tumour case. b, Genome Browser image showing the two
tumour-specific DHSs at the HMGA2 locus in three patients with FTC.
c, The normal cell-specific DHSs are enriched in multiple disease
ontologies in PTC 131.
Transcriptional regulators form diverse groups with
context-dependent regulatory functions


Gerald Stampfel¹, Tomáš Kazmar¹, Olga Frank¹†, Sebastian Wienerroither¹, Franziska Reiter¹ & Alexander Stark¹


One of the most important questions in biology is how trans- 
cription factors (TFs) and cofactors control enhancer function 
and thus gene expression. Enhancer activation usually requires 
combinations of several TFs!, indicating that TFs function 
synergistically and combinatorially”*. However, while TF binding 
has been extensively studied, little is known about how combinations 
of TFs and cofactors control enhancer function once they are bound. 
It is typically unclear which TFs participate in combinatorial 
enhancer activation, whether different TFs form functionally 
distinct groups, or if certain TFs might substitute for each other in 
defined enhancer contexts. Here we assess the potential regulatory 
contributions of TFs and cofactors to combinatorial enhancer 
control with enhancer complementation assays. We recruited GAL4- 
DNA-binding-domain fusions of 812 Drosophila TFs and cofactors 
to 24 enhancer contexts and measured enhancer activities by 82,752 
luciferase assays in S2 cells. Most factors were functional in at least 
one context, yet their contributions differed between contexts and 
varied from repression to activation (up to 289-fold) for individual 
factors. Based on functional similarities across contexts, we 
define 15 groups of TFs that differ in developmental functions 
and protein sequence features. Similar TFs can substitute for each 
other, enabling enhancer re-engineering by exchanging TF motifs, 
and TF-cofactor pairs cooperate during enhancer control and 
interact physically. Overall, we show that activators and repressors 
can have diverse regulatory functions that typically depend on the 
enhancer context. The systematic functional characterization of TFs 
and cofactors should further our understanding of combinatorial 
enhancer control and gene regulation. 

We sought to characterize the potential regulatory contributions of 
different TFs to combinatorial enhancer control, that is, the regulatory 
functions of the TF proteins following DNA binding, regardless of their 
specific roles in vivo. We reasoned that such contributions could be best 
assessed using ectopic tethering assays in the context of DNA sequences 
that closely resemble active enhancers, ideally only lacking the input 
of a single TF. In particular, such a setup may allow the assessment of 
obligate combinatorial factors whose regulatory activities depend on 
partners and which would otherwise appear non-functional. We there- 
fore developed enhancer complementation assays based on activator 
bypass experiments’ used to test candidate TF or cofactor function in 
transcription control and to dissect promoters* !°. 

We mutated TF-binding-motif sequences within active enhancers 
to ‘upstream activating sequence’ (UAS) motifs for the GAL4 DNA- 
binding domain (GAL4-DBD), recruited 474 Drosophila TFs"' via 
GAL4-DBD fusion proteins to the positions of the mutated motifs 
(enhancer context), and measured enhancer activities by luciferase 
assays in S2 cells, normalizing to GFP recruitment (Fig. 1a). Since
expression and recruitment are standardized, the factors’ regulatory
functions can be assessed in a highly controllable manner independently
of the factors’ endogenous expression and DNA binding.
Overall, the assays were highly reproducible: 75% of all data points
had standard deviations (s.d.) <10%, and 95% of the s.d. were <18%
across four biological replicates.

We started with an enhancer that was highly active in Drosophila 
S2 cells and for which mutations of CGCG- or GATA-type motifs 
strongly reduced its activity³ (Fig. 1b). We replaced either the CGCG-
or the GATA-motifs with UAS motifs (Fig. 1a) and assessed which
TFs restored enhancer function or repressed the remaining basal 
activities. Of the 474 TFs, 100 were activating (>1.5-fold compared 
to GFP; P< 0.05 false discovery rate (FDR)-corrected for 474 tests) 
in the CGCG- and 84 in the GATA-context (Fig. 1c), including TFs 
that recognize the CGCG- and GATA-motifs, respectively (Extended 
Data Table 1). This compares to 77 TFs that activated on their own 
(that is, when recruited to UAS motifs outside an enhancer context), 
suggesting that TF function might be context-dependent. Indeed, 
46 TFs activated the CGCG context at least 1.5-fold (P < 0.05) more 
strongly than the GATA context, even though both contexts were 
derived from the same enhancer (Fig. 1d and Extended Data Fig. 1). To 
test if native untagged TFs recapitulate these results, we chose Ets at 21C 
(Ets21C), Deformed (Dfd), and Hairy/E(spl)-related with YRPW motif 
(Hey) that preferentially activated the CGCG context to different extents 
(Fig. 1d). We replaced the CGCG- and GATA-motifs with binding sites 
for these TFs and expressed the untagged TFs (Fig. 1e). This activated
the mutant enhancers in a manner consistent with the results from 
GAL4-DBD-mediated recruitment: Ets21C and Dfd activated only 
the CGCG context, while Hey activated both (CGCG 1.3-fold more 
highly), confirming the similarity and context-dependency of these 
TFs’ regulatory functions. 
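
The activation calls in this paragraph combine a fold-change threshold with a multiple-testing correction. The sketch below assumes the Benjamini-Hochberg procedure for the FDR correction, which the text does not name, and uses invented fold changes and P values; the function names are illustrative.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def call_activators(
    fold_changes: list[float],
    p_values: list[float],
    min_fold: float = 1.5,
    alpha: float = 0.05,
) -> list[bool]:
    """A factor is called activating if its luciferase signal relative to GFP
    exceeds `min_fold` and its corrected P value is below `alpha`."""
    adjusted = benjamini_hochberg(p_values)
    return [fc > min_fold and q < alpha for fc, q in zip(fold_changes, adjusted)]


# Toy example with four factors.
p_raw = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(p_raw)
calls = call_activators([2.0, 3.0, 1.2, 5.0], p_raw)
```

Repressors could be called symmetrically with a fold change below the reciprocal threshold.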

Intrigued by the context-dependency of some TFs even within a 
single enhancer, we decided to include more diverse regulatory contexts 
(Extended Data Fig. 1). We created 19 motif-mutant enhancer contexts 
for different types of TF motifs and different enhancers with broad, 
cell-type-specific>, or hormone-inducible” activities. We also added 
five contexts consisting of UAS sites and core promoters specific 
towards developmental or housekeeping enhancers, respectively’. 

Nearly half of all TFs (42%) were activating and most (93%) of the 
remaining 276 TFs were repressing in at least one of the 24 contexts 
(≥1.5-fold; P < 0.05 FDR-corrected for 24 × 474 tests), suggesting
that most TF-fusion proteins were functional. Many TFs had similar 
regulatory effects across the 24 contexts, suggesting that they might 
be functionally equivalent. We grouped all TFs into 15 clusters using 
unsupervised spectral clustering (Fig. 2a and Supplementary Table 1) 
and confirmed that these clusters are robust to bootstrapping and repro- 
ducible when using independent biological replicates (Extended Data 
Fig. 2). This revealed clusters of diverse regulatory functions (Fig. 2b 
and Extended Data Fig. 3), including cluster 8 with TFs that activated 
in most contexts (global activators) such as Antennapedia, Sox14 and 
Sox15, Clock (Clk), and Zelda, and clusters 3, 5 and 7 with global 
repressors (for example, Snail, Runt, Engrailed and Krüppel). These
TFs seemed to dominate or override other regulatory cues, consistent 
with their ability to function in isolation. TFs of other clusters were only 
Figure 1 | Enhancer complementation assays for 474 TFs. a, Schematic
overview. b, Activity of an enhancer and motif-mutant variants (data 
from ref. 3). a.u., arbitrary units. c, Enhancer complementation assays for 
CGCG- and GATA-contexts (normalized luciferase values for 474 TFs; 
red: activation, blue: repression). d, Preferential activation of CGCG- 
versus GATA-context (>1.5-fold, FDR-corrected P < 0.05). Black boxes, 


weakly active in the contexts tested, and, notably, the TFs of some clus- 
ters were context-dependent. For example, cluster 10 TFs preferentially 
activated the housekeeping core promoter and might constitute factors 
of a distinct transcriptional program, including Myb-interacting protein 
120 (Mip120) and CG6813, the cluster’s strongest activator (21-fold; 
P=4.4 x 10~*). In contrast, cluster 1 TFs preferentially activated 
hormone-receptor contexts, that is, when recruited to ecdysone recep- 
tor (EcR)-binding sites in enhancers inducible by the insect steroid 
hormone ecdysone. Examples are Twist, Reversed polarity, Pointed, 
and other developmental TFs. 

Intrigued by TFs that preferentially activated hormone-receptor 
contexts, we selected four such TFs from clusters 1 and 15, Ets96B, 
Helix loop helix protein 4C (HLH4C), Atonal (Ato), and Glass (Gl), 
and asked whether replacing the EcR motif with the motifs of these 
TFs would activate the enhancer in a TF-dependent but hormone- 
independent manner. This was indeed the case for all four TFs, and 
the effect was specific to the combination of motif and enhancer con- 
text (Fig. 2c), suggesting that these TFs might contribute regulatory 
functions equivalent to the activated EcR. 

To assess the TF clusters independently of our approach, we asked 
whether they were enriched in Gene Ontology (GO) categories or 
in protein sequence features such as Pfam domains or short peptide 
motifs. Indeed, many such features were differentially distributed 
between the clusters (P< 0.01; empirical FDR =0.1) and each cluster 
was enriched for at least one such feature (Fig. 2d and Extended Data 
Fig. 4; Supplementary Table 2). As expected, amino acid repeats known 
to mediate activation (for example, poly-glutamine) or repression (for 
example, poly-alanine) were enriched in activating clusters (1, 8) and 
TFs inactive on their own; colour, TFs tested in e. e, Validation of 
context-dependent TFs (for details, see text). Luciferase activities 
(firefly/Renilla) with (colours) or without (grey and black (Ets21C is 
expressed in S2 cells)) co-transfection of the respective untagged TF. 
Error bars show standard deviations (n = 4, biological replicates). 


repressing clusters (3, 5, 7), respectively. However, of several activat- 
ing clusters, only cluster 1 was enriched in GO categories relating to 
development, suggesting a preferential use of cluster-1-type TFs during 
developmental gene regulation, which presumably relates to these TFs’ 
dependence on partner TFs, enabling combinatorial control. Similarly, 
only repressing cluster 7 but not 3 or 13 was enriched in GO categories 
relating to Notch signalling, cardiocyte differentiation, or morpho- 
genesis, suggesting that repression might occur through various means 
that are differently employed in vivo. Indeed, the three repressing clus- 
ters also differed in the enrichment of peptide motifs known to bind the 
co-repressors C-terminal binding protein (CtBP; cluster 7) or Sin3A 
(cluster 3), suggesting a functional association between the TFs in these 
clusters and the respective co-repressors (see below). 

These results show that the different TF clusters, obtained solely 
based on the TFs’ context-dependent regulatory functions, differ in sev- 
eral other aspects, which lends independent support to the clustering. It 
also suggests that the respective TFs are differentially employed in vivo 
(for example, during development), and that their distinct functions 
might arise through the recruitment of different types of cofactors (for 
example, CtBP versus Sin3A). 

To assess the regulatory activities and the clustering in different cell 
types, we tested 171 TFs (9 to 17 TFs from each cluster) across six contexts 
in Kc167, BG3 and ovarian somatic cells derived from embryos, 
larvae and adult ovaries, respectively. These cell types differ increasingly 
from S2 cells in gene expression, enhancer activities, and the enhancers’ 
motif signatures, yet TF activities were remarkably similar: all 18 
pairwise comparisons had Pearson correlation coefficients (PCCs) 
>0.5 and 15 had PCCs >0.8 (all P< 1 x 1077; Extended Data Fig. 5). 
Figure 2 | TFs have diverse regulatory functions. a, 15 TF clusters 
(see text for highlighted clusters). b, Normalized luciferase values 
across 24 contexts (selected from Extended Data Fig. 3). c, Validation of 
hormone-context-preferential TFs (luciferase activities (firefly/Renilla) 
for re-engineered enhancers (re-eng.; purple) and controls (grey; see 
schematic)). Error bars show standard deviations (n = 4, biological 
replicates). **P < 1 × 10⁻²; ***P < 1 × 10⁻³. d, Enrichments for TFs 


Moreover, the TFs’ activity profiles across the 6 contexts support the 
original clustering in each cell type: pairs of TFs from within original 
clusters were more similar than pairs across clusters (>1.4-fold and 
P<1x 10~° for all four cell lines; Fig. 2e). Intrigued by these results, 
we tested the ability of 107 Drosophila TFs (90) and cofactors (17) to 
activate transcription in human HeLa cells and found a good quanti- 
tative agreement across all factors (PCC = 0.74; Extended Data Fig. 6). 
These results suggest that many TFs function predominantly, but not 
completely, independently of cell type. We note that alternative splic- 
ing and post-translational modifications (for example, downstream of 
cellular signalling pathways) probably alter and diversify the regulatory 
functions of individual TFs. 

The regulatory functions of TFs are generally mediated through 
transcriptional cofactors, which typically lack DNA-binding domains 
and are recruited to enhancers by TFs. To assess whether cofactor 
functions are similarly diverse or potentially more uniform, we 
cloned 338 putative cofactors from diverse protein families (Fig. 3a 
and Supplementary Table 3) and tested their activating and repressing 


that activate or repress the 4x UAS-dCP or 4x UAS-hkCP contexts 
(‘Functional’), protein-sequence features and GO-categories; significant 
(P< 0.01; empirical FDR = 0.1) enrichments, red; and depletions, blue; 
others, white; see Extended Data Fig. 4 and Supplementary Table 2 for 
details and all data. e, Pairwise distances between activity profiles in 
Kc167, BG3 and OSC cells support functional TF clusters (all empirical 
P< 10°, indicated with single asterisks). 


functions in all 24 contexts using GAL4-DBD-mediated recruitment 
as for TFs (Fig. 3b). 

Most cofactors (80%) were sufficient to activate or repress transcrip- 
tion in at least one context (>1.5-fold; P< 0.05 after FDR correction for 
24 x 338 tests), and the activities of well-studied factors matched their 
known functions (Fig. 3c-e): for example, P300 (also known as Nejire) 
strongly activated transcription in all contexts, as did the histone- 
methyltransferase Lost PHDs of Trr (Lpt) of the Set1/COMPASS-like 
complex, and the Mediator subunits MED15 and MED25, while the 
co-repressors CtBP, Sin3A and CoRest were strongly repressing in all 
contexts. Other cofactors had context-specific functions, including 
Chromator (Chro), TBP-associated factor 4 (Taf4) and Trithorax- 
related (Trr), which preferentially activated the housekeeping core 
promoter (Trr was even repressing in all other contexts). Chro is part of 
the non-specific lethal (NSL) complex which activates genes involved in 
cell proliferation and DNA replication and Taf4 is important at TATA-less 
promoters. Similar to the corresponding TFs above, these cofactors 
might be part of a dedicated housekeeping regulatory program. 
Figure 3 | Transcriptional cofactors can be sufficient for activation 
and repression and have context-dependent regulatory functions. 
a, Pfam-domain content of the 338 putative cofactors tested. 
b, Scheme depicting enhancer complementation assays for cofactors. 
c, Overview of the regulatory activities for cofactors across all 24 contexts 
(bi-clustered 338 × 24 heat map) and zoom for selected cofactors. 
d, e, Diverse regulatory activities for bromodomain-containing cofactors (d) 
and subunits of the Mediator complex (e) (see Extended Data Fig. 7 for 
additional complexes). 


Notably, different members of a single complex or domain family 
frequently had different regulatory functions (Fig. 3d, e and Extended 
Data Fig. 7), cautioning against the transfer of annotations based 
on these grounds. For example, the activities of bromodomain-containing 
(BRD) cofactors (Fig. 3d) range from the strongly activating 
P300 and CG7154, the orthologue of human BRD7 and BRD9, to the 
context-dependent CG30417, and to Polybromo that was strongly 
repressing in most contexts, consistent with its previous implication in 
transcriptional repression. Similarly, Mediator subunits MED15 and 
MED25 were strongly activating in most contexts, MED23 and MED24 
were context-dependent, and MED29 was repressing, consistent with 
the function of human MED29 (Fig. 3e). This suggests that different 
TFs might interact with the Mediator complex through distinct subunits, 
or that complexes with variable composition and function might 
exist. Consistently, MED15 and MED25 interact directly with strong 
activators (for example, GAL4 and VP16) and MED23 and MED24 
are involved in signal-dependent and hormone-induced transcription, 
respectively. 

The activating and repressing effects across 24 contexts were highly 
similar for many TFs and cofactors (Fig. 4a), which provided a means 
to infer functional associations (Fig. 4b). As expected, P300, Lpt, 
MED15 and MED25 were assigned to globally activating TFs (clus- 
ter 8) and context-dependent cofactors to context-dependent TFs 
(for example, Mip120 and Bsh to cluster 10 and Chro to cluster 14). 
Interestingly however, the globally repressing cofactors Sin3A and 
CtBP were assigned to different clusters of repressing TFs (cluster 3 
versus 7), in agreement with the differential enrichment of peptide 
motifs involved in Sin3A and CtBP recruitment (Fig. 2d). Many of the 
assignments are consistent with known physical interactions, including 
the interaction between Chro and Pzg or CtBP and Sna. 
Indeed, the assignments were enriched for interactions reported in 
large-scale studies that used yeast two-hybrid assays or co-affinity 
purification (between 1.4- and 3.0-fold; all P < 0.05; Fig. 4c). In 
addition, the human orthologues of TF-cofactor pairs interacted 
1.8-fold (P = 0.025) more frequently than expected. These results suggest 
that the TF-cofactor assignments reflect functional associations 
and predict that the enhancer activation obtained by TF recruitment 
Figure 4 | TF-cofactor assignments through functional similarities. 
a, TFs (ovals) and cofactors (diamonds) show similar regulatory activities 
and localize next to one another in a bi-clustered 812 × 24 heat map 
(selection shown; see Supplementary Table 1 for all activities). Coloured as 
in Fig. 3c-e. b, TF clusters (as in Fig. 2a) and assigned cofactors. Highlighted 
are clusters 8, 14 and 3, 5, 7 and assigned cofactors. c, TF-cofactor 
assignments are enriched for physical interactions (red bars) compared 
to shuffled assignments, for which the box-plots indicate the 10th, 25th, 
50th, 75th and 90th percentiles; * hypergeometric P < 0.05. d, Boosting 
of Clk (top), CG17186 (middle) and Bsh (bottom) induced enhancer 
activities by untagged P300 (left), Chro (centre) and Tbp (right). Error bars 
show standard deviations (n = 4, biological replicates). **P < 1 × 10⁻²; 
***P < 1 × 10⁻³. 

150 | NATURE | VOL 528 | 3 DECEMBER 2015 


© 2015 Macmillan Publishers Limited. All rights reserved 


should be boosted by the assigned but not by unrelated cofactors, even 
if the cofactors are not tethered via GAL4-DBD (Fig. 4d). Indeed, 
in this experimental setup, the activation by Clk-recruitment 
(cluster 8) was further boosted by increasing amounts of untagged 
P300, but not by Chro or Tbp. In contrast, Chro specifically boosted 
CG17186 (cluster 14) and Tbp specifically boosted Bsh (cluster 10), 
consistent with the respective assignments. 

Enhancer complementation assays provide a unique annotation and 
categorization of TFs and cofactors based on their regulatory functions, 
independent of the factors’ endogenous roles and complementary to 
previous classifications through genetics, sequence comparisons, or 
genomics (for example, ChIP-seq). For many factors, including 266 
putative TFs and cofactors (‘CG’ genes; Extended Data Fig. 8), our work 
provides the first functional characterization. All data are available at 
http://factors.starklab.org. 

The existence of equivalence groups among TFs and cofactors 
with diverse context-dependent functions, even amid activators and 
repressors, has profound implications for our understanding of tran- 
scriptional gene regulation: while some enhancers might be controlled 
predominantly by individual activators, others may rely on specific 
combinations of distinct regulatory functions that are complementary 
and each insufficient for activation. It is therefore conceivable that 
different types of enhancers are controlled through non-overlapping 
sets of TFs and cofactors, enabling separate transcriptional programs 
even within individual cells (for example, ref. 13). The approach and 
categorization presented here provide a framework to dissect the 
molecular and biochemical nature of these functions and the mecha- 
nisms by which cooperativity at enhancers is established and transcrip- 
tional activation of target core promoters is achieved. Understanding 
these mechanisms will be crucial at a time when enhancer function and 
its control by TFs and cofactors are becoming increasingly central to 
our understanding of gene regulation in development and disease and 
the focus of novel therapeutic strategies. 

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Cloning of N-terminal GAL4-DBD-tagged TF and cofactor library. For TFs, 
gateway-compatible entry clones (Invitrogen) containing the open reading frames 
(ORFs) lacking stop codons were obtained from ref. 11. Drosophila Act5C-promoter 
driven expression clones were created using the Gateway system. The TF ORFs 
were shuttled into the GAL4-DNA binding domain (DBD) containing destina- 
tion vector pAGW-GAL4-DBD (cloned as described below) by mixing 100 ng of 
TF entry clone, 100 ng of pAGW-GAL4-DBD and 0.7 µl of LR clonase II enzyme 
mix (Invitrogen). The identities of all TF entry clones have been confirmed by 
Sanger sequencing using the primers 5'-CCCAGTCACGACGTTG-3' and 
5'-CACAGGAAACAGCTATG-3'. Note that we tested the full-length transcription 
factors, including their DBDs, as trans-activating and DNA-binding functions 
might not always reside in entirely separate protein domains. While this implies 
that the fusion proteins might bind via the TFs’ DBDs in addition to the GAL4- 
DBD mediated recruitment, this does not influence the results of the assay: the 
assay itself measures transcriptional activation independently of where TF binding 
occurs and we expect that the TFs’ DBDs have at most minor effects on binding 
strengths as the GAL4-DBD binds to DNA already very strongly. 

For cofactors, we compiled a list of 338 cofactors based on several criteria. We 
included proteins containing Pfam domains typical for transcriptional cofactors 
(for example, HAT, HDAC, SET, Chromo, Bromo), proteins which are part of 
chromatin modifying or remodelling complexes or part of complexes associated 
with RNA polymerases (for example, SAGA, Polycomb, TFIID, Mediator), and 
Drosophila proteins which are homologues of mammalian chromatin-associated 
proteins (Supplementary Table 3). We amplified the cofactor ORFs from cDNA 
using oligonucleotides containing Gateway-compatible attB-sites 
(5'-GGGGACAAGTTTGTACAAAAAAGCAGGCTTC-3' and 
5'-GGGACCACTTTGTACAAGAAAGCTGGGTC-3') for subsequent entry clone creation. 
The primer sequences have been chosen to be as close as possible to an 
annealing temperature of 60°C, which we calculated using the formula 

$$T_{\mathrm{anneal}} = 64.9 + 41 \times \frac{cG + cC - 16.4}{cA + cT + cG + cC}$$

with cA, cT, cG and cC being the number of adenines, thymines, guanines and cytosines, 
respectively. The full list of resulting primer sequences (lacking the attB sequences) 
is listed in Supplementary Table 3; for 18 of the cofactors no primer sequences are 
available because we obtained these entry clones from ref. 11 categorized as TFs 
but manually re-categorized them as cofactors based on their annotation in 
FlyBase or their protein domain content. For cDNA generation, RNA was 
isolated from S2 cells and reverse transcribed as described in ref. 33. For PCR 
amplification, KOD and KOD XL DNA (Merck Millipore) and KAPA HiFi (KAPA) 
polymerases were used according to manufacturer's specifications. We created 
Gateway entry clones by mixing 1 µl of PCR reaction, 100 ng of pDONR221, and 
1 µl of BP clonase II enzyme mix (Invitrogen). The identities and correctness of all 
entry clones have been ensured using Sanger and next-generation sequencing (see 
below) and we deposited them at Addgene (http://www.addgene.org/Alexander_ 
Stark/). The cofactor ORFs were then shuttled to the Drosophila Act5C-promoter 
driven destination vector pAGW-GAL4-DBD as described for TFs. 
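
As an illustration of the primer-design criterion above, the following Python sketch evaluates the GC-content-based temperature formula for a given oligonucleotide. It assumes the constants 64.9, 41 and 16.4 as read from the formula in the text; the function names `annealing_temperature` and `pick_primer_length` and the 15-35 nt length window are hypothetical and are not taken from the authors' pipeline.

```python
def annealing_temperature(primer: str) -> float:
    """Return 64.9 + 41 * (cG + cC - 16.4) / (cA + cT + cG + cC) in degrees C."""
    seq = primer.upper()
    counts = {base: seq.count(base) for base in "ATGC"}
    length = sum(counts.values())
    if length == 0 or length != len(seq):
        raise ValueError("primer must be a non-empty string of A, C, G and T")
    return 64.9 + 41.0 * (counts["G"] + counts["C"] - 16.4) / length


def pick_primer_length(orf_start: str, target: float = 60.0,
                       min_len: int = 15, max_len: int = 35) -> str:
    """Hypothetical helper: return the ORF prefix whose value is closest to target."""
    longest = min(max_len, len(orf_start))
    if longest < min_len:
        raise ValueError("sequence shorter than the minimum primer length")
    # Try every allowed primer length and keep the one nearest the target.
    candidates = [orf_start[:n] for n in range(min_len, longest + 1)]
    return min(candidates, key=lambda p: abs(annealing_temperature(p) - target))
```

For the sequencing primer 5'-CCCAGTCACGACGTTG-3' quoted earlier (16 nt, of which 10 are G or C), `annealing_temperature` evaluates to approximately 48.5.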

Verification of cofactor clones by Sanger and next-generation sequencing. The 
insert flanks of all obtained cofactor entry clones have been Sanger-sequenced and 
automatically checked to cover the TSS and TTS of one of the isoforms annotated 
by FlyBase. All entry clones passing additional manual visual inspection using 
BLAT and the UCSC genome browser have been subjected to further verification 
by next-generation sequencing as follows. A pool of 100-300 entry clones corresponding 
to a total of 5 µg DNA dissolved in 50 µl TE buffer was sonicated (duty 
cycle, 20%; intensity, 5; 200 cycles per burst; time, 90 s) to 200-400 bp using an 
S220 Focused-ultrasonicator (Covaris) as described in ref. 11. The fragmented 
plasmid pool was then prepared for deep sequencing using the Illumina DNA 
Sample Prep kit and sequenced using a HiSeq2000 (Illumina) producing 50-nt 
reads. The resulting reads have been assembled and analysed using PrInSes-C. 
All insert sequences not starting with ATG, containing a stop codon or a frameshift 
were immediately rejected. All sequences with fewer than five mutations leading to 
non-synonymous amino acid changes were immediately accepted. The remaining 
sequences were translated, aligned against the respective protein sequence, and 
manually decided. The next-generation sequencing reads have been deposited 
at the NCBI Sequence Read Archive (SRA) under the accession SRS806429; the 
PrInSes-C-generated full-length transcript sequences are available at http://factors. 
starklab.org and in Supplementary Data 1, and the cofactor Gateway entry clones 
from Addgene (http://www.addgene.org/Alexander_Stark/). 
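
The accept/reject rules for sequenced inserts described above can be sketched as a small triage function. This is a simplified illustration under stated assumptions: a length that is not a multiple of three stands in for frameshift detection, amino acid changes are counted position by position instead of from an alignment, and the names `translate` and `classify_insert` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def classify_insert(insert_dna: str, reference_protein: str) -> str:
    """Return 'reject', 'accept' or 'manual' for one assembled insert sequence."""
    dna = insert_dna.upper()
    if not dna.startswith("ATG"):
        return "reject"                 # does not start with ATG
    if len(dna) % 3 != 0:
        return "reject"                 # crude frameshift proxy (assumption)
    protein = translate(dna)
    if "*" in protein:
        return "reject"                 # contains a stop codon
    if len(protein) != len(reference_protein):
        return "manual"                 # would need a real alignment
    changes = sum(a != b for a, b in zip(protein, reference_protein))
    return "accept" if changes < 5 else "manual"
```

Anything that is neither rejected nor accepted is returned as `manual`, mirroring the manual decision step in the text.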

Cloning of destination vector pAGW-GAL4-DBD. We cloned a destination vec- 
tor to conveniently create vectors expressing N-terminally V5- and GAL4-DBD- 
tagged TFs and cofactors under the control of the Drosophila Act5C promoter 
using the Gateway cloning system. pAGW-GAL4-DBD was cloned by amplifying 
the GAL4-DBD from pBPGUw using one oligonucleotide containing the V5-tag 
(peptide sequence MGKPIPNPLLGLDST) 
5'-TCTGATATCATGGGGAAGCCAATCCCTAATCCCCTTCTGGGACTCGACTCTACCGGCGGCTCTATGAAGCTACTGTCTTCTATCGAACA-3' 
and the oligonucleotide 5'-TATACCGGTGGCCGCCGCCCGACGATACAGTCAACTGTCTTTGAC-3'. Amplification 
was performed using KOD Polymerase (Merck Millipore) according to the manufacturer’s 
instructions. The resulting PCR product was digested using EcoRV and 
AgeI and ligated into pAGW (Drosophila Gateway Vector Collection), which was 
digested using the same enzymes, thereby replacing eGFP with V5-GAL4-DBD. 

Cloning of luciferase reporter vectors. We created Gateway-compatible 
(Invitrogen) destination vectors to conveniently clone reporter vectors for differ- 
ent regulatory contexts based on firefly luciferase transcribed from a housekeeping 
core promoter (hkCP; promoter of ribosomal gene RpS12) or a developmental 
core promoter (dCP; Drosophila synthetic core promoter (DSCP) derived from 
Eve). 

We created the destination vector attR_dCP_luc by digesting pGL4.26 
(Promega) with FseI and BglII and ligating a fragment containing DSCP and luc+, 
thereby replacing the minimal promoter and luc2 with DSCP-luc+. We digested 
the resulting vector with KpnI and BglII and ligated a fragment containing the attR 
Gateway cassette, yielding attR_dCP_luc. We created two hkCP-driven destination 
vectors containing a Gateway cassette either upstream (attR_hkCP_luc) or 
downstream (hkCP_luc_attR) of the luciferase reporter gene by using the plasmid 
pGL3 (Promega) as a basis and replacing the SV40 promoter with the promoter of 
RpS12 as described in ref. 13. The resulting vector was digested using either KpnI 
and BglII (to create attR_hkCP_luc) or AfeI (to create hkCP_luc_attR); in both 
cases, we amplified a Gateway attR cassette using oligonucleotides containing the 
respective restriction sites, and digested and ligated it into the digested plasmid. 

All enhancers, motif mutant contexts and other motif or backbone mutant 
variants were either PCR amplified with primers containing attB Gateway sites or 
ordered as synthesized fragments (IDT), shuttled into entry clones using TOPO or 
BP Clonase II (both Invitrogen), and shuttled into the luciferase destination vectors 
using the LR clonase II enzyme mix (Invitrogen) by mixing 1 µl of PCR product 
or synthesized DNA dissolved in TE buffer, 100 ng of destination vector and 0.7 µl of 
LR clonase II enzyme mix (Invitrogen). 

We used a modified version of pRL-TK (Promega) to normalize the firefly signal 
for transfection efficiency and cell number. Ubi-RL has been created by cloning a 
region upstream of the gene Ubi-p63E (chr3L: 3901760-3902637) upstream of the 
Renilla luciferase gene in reverse orientation using NheI and BglII. 

Drosophila cell culture. S2 cells, derived from embryos, were obtained from Life 
Technologies and grown in Schneider’s Drosophila Medium (Life Technologies 
21720-024) supplemented with 10% FBS (Sigma F7524) and 1% penicillin/ 
streptomycin (Life Technologies 15140-122) in T75 flasks (ThermoScientific 
156499) at 27°C and passaged every 2-4 days. BG3 neuroblast-like cells, derived 
from larvae, were obtained from the Drosophila Genomics Resource Center 
(DGRC) and grown in Schneider’s Drosophila Medium supplemented with 10% 
FBS, 1% penicillin/streptomycin, and 10 µg ml⁻¹ insulin (Sigma-Aldrich 11882) 
in T75 flasks at 27°C and passaged every 3-4 days. Kc167 cells, derived from 
embryos, were obtained from DGRC and grown in M3/BPYE Medium containing 
5% FBS and 1% penicillin/streptomycin in T75 flasks at 27°C and passaged 
every 2-3 days. Ovarian somatic cells (OSCs), derived from adult ovaries, were 
obtained from the laboratory of J. Brennecke and grown in Shields and Sang M3 
Insect Medium (Sigma-Aldrich S8398) supplemented with 10% FBS, 1% insulin, 
1% glutathione, 1% fly extract, and 1% penicillin/streptomycin in T75 flasks at 
27°C and passaged every 2-3 days. All cell lines used are regularly checked for 
mycoplasma contamination. 

Transfections of Drosophila cell lines. S2 cell transfections were performed 
using jetPEI (peqlab 13-101-40N). Four hours before transfection, 30,000 cells 
(30 µl of a 10⁶ cells per ml suspension) were seeded in clear polystyrene 384-well 
plates (ThermoScientific 164688). For each transfection, we used 30 ng firefly 
luciferase reporter plasmid, 3 ng Renilla luciferase expressing plasmid Ubi-RL, 
and 3 ng GAL4-DBD-TF/cofactor or GAL4-DBD-GFP fusion protein expressing 
plasmid. Beforehand, we assayed the effects of using different amounts of GAL4-DBD 
fusion protein expressing plasmid and chose 3 ng (Extended Data Fig. 9). 
The DNA solution containing 36 ng DNA in 5 µl TE buffer was filled up to 15 µl 
using sterile 150 mM NaCl (polyplus) and prepared in 96-well plates. Transfection 
reagent (15 µl total: 13.95 µl 150 mM NaCl, 1.05 µl jetPEI) was added to each well 
of the 96-well plates and mixed vigorously. After 30 min incubation at 25°C, cells 
were transfected in quadruplicates by transferring each transfection mix four 
times (6 µl each) to four adjacent wells of a 384-well plate containing the seeded 
cells. Luciferase assays were performed after 48 h of growth at 27°C. Handling 
the transfection mixes and all subsequent pipetting steps have been performed 
using a Bravo Automated Liquid Handling Platform (Agilent). Kc167, BG3, and 
OSC cell transfections were performed using jetPEI in the same way as described 
above for S2 cells with the exception of transfection reagent composition: 15 µl 
total containing 14.1 µl 150 mM NaCl and 0.9 µl jetPEI. 

HeLa cell culture and transfections. Human HeLa cells (gift from the laboratory 
of J. M. Peters) were grown in DMEM medium (Gibco 52100-047) supplemented 
with 10% heat-inactivated FBS, 1% penicillin/streptomycin and 2 mM L-glutamine 
(Sigma G7513) in T75 flasks at 37°C in an atmosphere of 95% air and 5% carbon 
dioxide. All cell lines used are regularly checked for mycoplasma contamination. 
We performed HeLa cell transfections using a self-prepared 1 mg ml⁻¹ PEI 
(25,000 MW, Polysciences 23966) stock solution in PBS (pH adjusted to pH 4.5 and 
sterile filtered). On the day before transfection we seeded 30 µl of a suspension containing 
4,000 HeLa cells in medium (DMEM, 10% FBS, penicillin/streptomycin) 
into each well of a 384-well plate. Three microlitres of a PEI/DMEM mix (0.24 µl 
PEI filled to a total of 4.5 µl using DMEM without FBS and penicillin/streptomycin 
and incubated at room temperature for 5 min) were added to 3 µl of a DNA/DMEM 
mix (44.5 ng firefly luciferase reporter vector, 4.45 ng TF expression vector (created 
using pAGW-CMV_GAL4-DBD, see below) and 4.45 ng pRL-CMV vector 
for transfection normalization (Promega #E2261) in DMEM without FBS and 
penicillin/streptomycin). The resulting DNA/PEI mix in DMEM was incubated at 
room temperature for 30 min and subsequently added to the seeded cells. We per- 
formed cell lysis and luciferase assays using the Promega dual-luciferase reporter 
assay system (Promega E1910) according to the manual. 

We created the Gateway destination vector pAGW-CMV_GAL4-DBD by 
replacing the Drosophila Act5C promoter in pAGW-GAL4-DBD with a region 
containing the CMV enhancer and the T7 promoter amplified from pRL-CMV 
using the primers 5'-CGACAGATCTTCAATATTGGCCATTAGCCATAT-3' and 
5'-GGTGGCTAGCCTATAGTGAGTCGTATTA-3'. 

Luciferase assays. Dual-luciferase assays were performed using self-prepared substrate 
solutions (D-Luciferin and Coelenterazine have been obtained from GoldBio 
LUCK-250 and pjk-Gmbh 102111) and lysis buffer as described in ref. 40. For cell 
lysis, the supernatant was removed and 30 µl of lysis buffer was added and incubated 
with gentle shaking for 30 min. Ten microlitres of the cell lysates were transferred to 
black 384-well plates for luminescence assays (Nunc MaxiSorp, Sigma-Aldrich 
P6491-1CS). All pipetting steps have been performed using a Bravo Automated 
Liquid Handling Platform (Agilent). Luminescence was measured after adding 
20 µl of each substrate, for firefly and Renilla luciferase respectively, using a Biotek 
Synergy H1 plate reader coupled to a plate stacker. 

Luciferase data analysis and plots. We normalized all firefly luciferase signals to 
the signal of Renilla luciferase to control for transfection efficiency and cell number 
(the relative luciferase signal). We then further normalized all relative luciferase 
signals for TF- and cofactor-GAL4-DBD transfections to relative luciferase signals 
obtained for GAL4-DBD-GFP transfections (fold-change over GFP). We assessed 
statistical significance by two-sided unpaired t-tests on the two sets of quadrupli- 
cate relative luciferase signals (GAL4-DBD-TF/COF versus GAL4-DBD-GFP). 
Throughout the paper, ‘activation’ was defined as a fold-change >1.5 (P<0.05), 
and ‘repression’ was defined as a fold-change <1/1.5 (P< 0.05), both compared to 
the signal for GAL4-DBD-GFP. We corrected the P values for multiple testing using 
the Benjamini and Hochberg method as implemented in R (p.adjust with method 
‘BH’ or its alias ‘fdr’). All statistical calculations and graphical displays, if not stated 
otherwise, have been performed using version 2.15.3 of the R software suite. 

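
The normalization and calling steps above can be sketched in Python as follows. This is an illustrative sketch, not the authors' R code: averaging the quadruplicates by their mean is an assumption, the t-test itself is omitted (P values are taken as input), and the function names are hypothetical. The `bh_adjust` function follows the Benjamini and Hochberg step-up procedure, the method that R's `p.adjust` applies for `method = "BH"`.

```python
from statistics import mean


def fold_change_over_gfp(tf_wells, gfp_wells):
    """Each argument is a list of (firefly, renilla) raw signals, one pair per well."""
    tf_rel = [firefly / renilla for firefly, renilla in tf_wells]
    gfp_rel = [firefly / renilla for firefly, renilla in gfp_wells]
    # Averaging replicate wells by the mean is an assumption of this sketch.
    return mean(tf_rel) / mean(gfp_rel)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position                      # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_effect(fold_change, adjusted_p, fold=1.5, alpha=0.05):
    """Apply the paper's definitions of 'activation' and 'repression'."""
    if adjusted_p < alpha and fold_change > fold:
        return "activation"
    if adjusted_p < alpha and fold_change < 1.0 / fold:
        return "repression"
    return "none"
```

In use, one fold-change and one P value would be computed per factor and context, the P values adjusted jointly across all tests (for example 24 × 474 for the TFs), and `call_effect` applied to each pair.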
TF cluster feature enrichment analysis. Enrichment analyses have been per- 
formed for each of the 15 clusters and for 6 types of features. To first obtain a 
coarse functional characterization of the clusters, we assessed the enrichments 
and depletions of TFs which are able to activate or repress a developmental (dCP) 
or housekeeping (hkCP) core promoter on their own (>1.5-fold activation or 
repression (P < 0.05), both compared to the signal for GAL4-DBD-GFP when 
tested on a context comprised of UAS sites upstream of a developmental core promoter 
(4x UAS-dCP) or a housekeeping core promoter hkCP (4x UAS upstream 
hkCP)). Homopolymeric amino acid repeat motifs have been de novo discovered 
using MEME (version 4.8.1, q-value threshold of 1 x 10~°) in TFs that activated 
or repressed on their own outside enhancer contexts (tested in the 4x UAS dCP 
context; >1.5-fold; P< 0.05). Pfam domain signature matches in the Drosophila 
proteome have been generated using hmmer (version 3.0b3, e-value threshold 
of 0.01). Eukaryotic Linear Motifs (ELM; version 08/2014) were matched to the 
amino acid sequences of the tested TF protein isoforms, after masking the TFs’ 
Pfam. Additionally, Gene Ontology (GO) annotations, and gene expression 
patterns in the Drosophila embryo as annotated by ref. 46 (IMAGO) have been 
subjected to enrichment and depletion analyses. 

To control for multiple testing, we empirically determined false-discovery rates 
(FDRs) for the different hypergeometric P values. For this, we repeated the feature 
enrichment analyses 1,000 times, each after randomly shuffling the TF-to-cluster 
assignments, and recorded the best (that is, most significant) P values. We then 
adjusted the original P values such that only 10% of the 1,000 random controls 
reached the P values of the original data (FDR < 10%). Following this protocol, we 
separately adjusted the FDR cut-off for each cluster (15) and feature type (ELM, 
MEME, Pfam, GO, IMAGO). 
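
The shuffling-based control described above can be illustrated with a minimal Python sketch. It is deliberately simplified and rests on assumptions: it treats a single binary feature, takes the best P value over all clusters per shuffle, and derives one cutoff, whereas the text describes separate cut-offs per cluster and feature type. The function names are hypothetical.

```python
import math
import random


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X successes among `draws` items drawn without replacement."""
    total = math.comb(population, draws)
    upper = min(successes, draws)
    hits = sum(math.comb(successes, i) * math.comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return hits / total


def best_p(cluster_of, has_feature):
    """Most significant enrichment P value of one feature over all clusters."""
    n_tfs = len(cluster_of)
    n_with_feature = sum(has_feature)
    best = 1.0
    for cluster in set(cluster_of):
        members = [f for c, f in zip(cluster_of, has_feature) if c == cluster]
        p = hypergeom_tail(sum(members), n_tfs, n_with_feature, len(members))
        best = min(best, p)
    return best


def empirical_cutoff(cluster_of, has_feature, n_shuffles=1000, fdr=0.1, seed=0):
    """P value cutoff that at most a fraction `fdr` of shuffled controls fall below."""
    rng = random.Random(seed)
    labels = list(cluster_of)
    null_best = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)                      # break TF-to-cluster assignment
        null_best.append(best_p(labels, has_feature))
    null_best.sort()
    return null_best[int(fdr * n_shuffles)]
```

An observed enrichment would then be reported only if its hypergeometric P value is strictly below the returned cutoff.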

Validation with the TFs’ endogenous motifs. To assess if tethering via the GAL4- 
DBD reflects the different TFs’ regulatory functions when bound to their endog- 
enous motifs, we selected two sets of TFs, three TFs that preferentially activated 
the CGCG- versus the GATA-context (Fig. 1e) and four TFs that preferentially 
activated the hormone-receptor contexts (Fig. 2c). We replaced each UAS site in the 
enhancer mutant contexts S2-1 CGCG, S2-1 GATA, and Nhe2 EcR (which also 
corresponds to an endogenous TF motif in the wild-type enhancers, for example, 
the EcR motif for the hormone contexts) with a sequence corresponding to the consensus 
motif of the respective TF as reported in refs 47, 48 (Dfd: CTTAATGA, Hey: 
CAGCCGACACGTGCCCC, Ets21C: ATTTCCGGT, Ato: AACAGGTGG, Ets96B: 
ACCGGAAGTAC, Gl: ATTTCAAGAATA, HLH4C: AAAAACACCTGCGCC). 
The enhancer rescue constructs were synthesized by IDT, shuttled into the 
luciferase reporter vector attR_dCP_luc using the Gateway system and tested in 
luciferase assays in S2 cells exactly as described above. 

TF-cofactor association assays. To assess potential functional associations of 
assigned TFs and cofactors, we followed the strategy from ref. 30, recruiting TFs via 
GAL4-DBD and providing untagged cofactors. For this, we chose contexts in which 
the different TFs (Clk of cluster 8, Bsh of cluster 10, and CG17186 of cluster 14) 
were active (4x UAS-dCP for Clk and 4x UAS-upstream-hkCP for Bsh and 
CG17186). We prepared DNA mixes to be transfected containing 29 ng firefly 
luciferase reporter plasmid, 3 ng Renilla luciferase expressing plasmid Ubi-RL, 1 ng 
(Bsh and CG17186) or 0.5 ng (Clk) of GAL4-DBD-TF fusion protein expressing 
plasmid and an increasing series of untagged cofactor expressing plasmid (0 ng, 
0.003 ng, 0.006 ng, 0.012 ng, 0.023 ng, 0.047 ng, 0.094 ng, 0.188 ng, 0.375 ng, 0.75 ng, 
1.5 ng, 3 ng). We kept the total amount of transfected plasmid DNA constant at
36 ng for all experiments using a GFP-expressing plasmid. To clone the expression
plasmids for the untagged cofactors and GFP, we used the Gateway-compatible
vector pAW (Drosophila Gateway Vector Collection). The remaining experimental
procedure and analysis were performed as described above.
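
As a worked example of the arithmetic behind these DNA mixes, the short Python sketch below computes how much GFP-expressing plasmid would be needed to keep each mix at 36 ng. The constant and function names are hypothetical, and the assumption that the GFP plasmid makes up exactly the remainder follows from the description above rather than from a published protocol sheet.

```python
TOTAL_NG = 36.0    # total plasmid DNA per transfection
FIREFLY_NG = 29.0  # firefly luciferase reporter plasmid
RENILLA_NG = 3.0   # Renilla luciferase plasmid Ubi-RL

# Cofactor plasmid series from the text (ng)
COFACTOR_SERIES_NG = [0, 0.003, 0.006, 0.012, 0.023, 0.047,
                      0.094, 0.188, 0.375, 0.75, 1.5, 3]


def gfp_filler_ng(tf_ng, cofactor_ng):
    """GFP plasmid (ng) needed to bring one mix up to TOTAL_NG."""
    filler = TOTAL_NG - FIREFLY_NG - RENILLA_NG - tf_ng - cofactor_ng
    if filler < -1e-9:
        raise ValueError("mix exceeds the total amount of plasmid DNA")
    return round(max(filler, 0.0), 3)


def mixes(tf_ng):
    """(cofactor ng, GFP filler ng) for every step of the cofactor series."""
    return [(c, gfp_filler_ng(tf_ng, c)) for c in COFACTOR_SERIES_NG]


for cofactor_ng, gfp_ng in mixes(tf_ng=1.0):  # 1 ng for Bsh and CG17186
    print(f"cofactor {cofactor_ng} ng, GFP filler {gfp_ng} ng")
```

With 1 ng of GAL4-DBD-TF plasmid the highest cofactor amount (3 ng) leaves no room for filler, whereas with 0.5 ng (Clk) 0.5 ng of GFP plasmid remains.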

Transcription factor clustering, visualization, and assignment of cofactors to 
transcription factors. We clustered the 474 TFs based on the log-transformed 
fold-change values (TF over GFP) from all 24 contexts. First, we standardized 
all contexts and constructed a k-nearest-neighbour graph (k = 15). We used the
Euclidean distance as the distance measure because it reflects both the variation of the
enhancer activity profile across contexts and the effect sizes within each context;
that is, it is able to discriminate between strong and weak activators and repressors
even if they vary similarly across the 24 contexts. Next, we took a symmetrized
(A + Aᵀ) adjacency matrix of this graph and solved multiclass spectral clustering
as described in ref. 49 and implemented in the Python package scikit-learn. To
decide on the number of clusters and to assess the clustering validity,
we analysed the clustering stability upon bootstrapping the data set. To
visualize the data, we mapped them onto a plane using a specialized nonlinear
dimensionality reduction technique (t-SNE). The algorithm provides the
visualization by mapping data points close in the original space to nearby locations
in the plane, preserving the local structure. We extended the k-nearest-neighbour
graph to include cofactors by comparing the log-transformed fold-change
values (cofactor over GFP) of cofactors and TFs (k = 5, Euclidean distance).
The locations of the cofactors in the visualization were obtained from spring 
layout. 
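
The graph construction described above (standardize each context, connect every TF to its k nearest neighbours by Euclidean distance, then symmetrize the adjacency matrix as A + Aᵀ) can be sketched with the Python standard library alone. The function names and the toy profiles are hypothetical, and details such as tie handling or the use of the population standard deviation are assumptions rather than the authors' choices.

```python
import math
import statistics


def standardize_contexts(rows):
    """Scale every context (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def knn_adjacency(rows, k):
    """Directed k-nearest-neighbour adjacency matrix A (Euclidean distance)."""
    n = len(rows)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted((math.dist(rows[i], rows[j]), j) for j in range(n) if j != i)
        for _, j in dists[:k]:
            adj[i][j] = 1
    return adj


def symmetrize(adj):
    """Return A + A^T, so that mutual neighbours receive weight 2."""
    n = len(adj)
    return [[adj[i][j] + adj[j][i] for j in range(n)] for i in range(n)]


# Toy profiles: four "TFs" measured in two "contexts" (hypothetical values)
profiles = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
affinity = symmetrize(knn_adjacency(standardize_contexts(profiles), k=1))
print(affinity)
```

A matrix of this kind could then be handed to a spectral clustering routine, for instance scikit-learn's `SpectralClustering` with `affinity="precomputed"`; whether that exact call matches the published analysis is not stated in the text.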

TF candidate recovery of enhancer mutants. We know that UAS sites in the
enhancer mutant contexts most probably replace binding sites that are functional,
but we do not know which TFs bind them in vivo. To check whether
we recover these positive controls in the enhancer mutants, we took all the TFs
expressed in S2 cells (RPKM > 1) (ref. 53) for which motifs are known. We
scanned the wild-type enhancer sequences (S2-1-wt, S2-2-wt, S2-3-wt, Ubi-1-wt,
Ubi-2-wt, Ubi-3-wt) for motif matches with P < 9.76 × 10⁻⁴ (1/4,096) using an
in-house motif-detection program. For each mutant context, we considered only
those TFs for which at least one motif match had at least 5 mutated base pairs. In
the resulting set of TFs (Extended Data Table 1) there is at least one TF for each of
the enhancer mutant contexts that activated the respective context when recruited
via the GAL4-DBD (>1.5-fold activation compared to GFP; P < 0.05).

Cell type analysis: intra- versus inter-cluster distances. We tested a subset of the
original 474 TFs in four different cell types (S2, Kc167, BG3 and OSC). This subset
consists of 171 TFs covering all 15 clusters with 9-17 TFs each, including all the TFs
mentioned in the main text. In each cell type, we computed Euclidean distances
after standardizing the log₂-transformed fold-change values in each context. Then
we compared the distances of intra-cluster TF-TF pairs (both TFs belong to the
same cluster) to inter-cluster TF-TF pairs (each of the TFs belongs to a different
cluster). To test whether the medians of these two groups of distances are
significantly different, we determined empirical P values as follows. We randomly 
shuffled TF-to-cluster assignments 10° times and each time computed the medians 
of the distances for both groups. We mark the P values P< 1 x 10~° as we never 
obtained a difference between the medians of intra- and inter-cluster distances as 
large as for the actual data for any of the cell types. 
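
A minimal sketch of such a shuffling test is given below. It is not the authors' implementation: the function names, the toy profiles and the choice of 1,000 shuffles are hypothetical, and the statistic used here (median inter-cluster distance minus median intra-cluster distance) is one plausible reading of "a difference between the medians".

```python
import itertools
import math
import random
import statistics


def median_gap(profiles, labels):
    """Median inter-cluster distance minus median intra-cluster distance."""
    intra, inter = [], []
    for i, j in itertools.combinations(range(len(profiles)), 2):
        d = math.dist(profiles[i], profiles[j])
        (intra if labels[i] == labels[j] else inter).append(d)
    return statistics.median(inter) - statistics.median(intra)


def shuffle_test(profiles, labels, n_shuffles=1000, seed=0):
    """Count shuffles whose median gap is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = median_gap(profiles, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # random TF-to-cluster assignment, cluster sizes kept
        if median_gap(profiles, shuffled) >= observed:
            hits += 1
    return observed, hits


# Toy data (hypothetical): two tight groups of three "TFs" each
profiles = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels = [1, 1, 1, 2, 2, 2]
observed, hits = shuffle_test(profiles, labels)
print(observed, hits)
```

If no shuffle reaches the observed gap, the empirical P value can only be bounded by one over the number of shuffles, which is how the P values are reported above.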
Context                 Replaced motif  Reference               Genomic coordinates       Length
S2-1 dCP                GATA            Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
  (‘GATA mutant context’)
S2-1 dCP                CGCG            Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
  (‘CGCG mutant context’)
2xUAS dCP               -               -                       -                         61 bp
Nhe2-EcR dCP            EcR             Shlyueva et al. 2014    chr2L:21113350-21113776   439 bp
DipB-EcR dCP            EcR             Shlyueva et al. 2014    chr3R:9616571-9616858     357 bp
sn-EcR dCP              EcR             Shlyueva et al. 2014    chrX:7867729-7868227      514 bp
4xUAS downstream hkCP   -               -                       -                         100 bp
4xUAS upstream hkCP     -               -                       -                         100 bp
4xUAS dCP               -               -                       -                         100 bp
Trl-2xUAS dCP           -               -                       -                         73 bp
zen_VRE dCP             dl              Jiang et al. 1993       chr3R:2581086-2581277     174 bp
S2-1 dCP                eyg             Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-1 dCP                Trl             Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-1 dCP                Pal             Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-1 dCP                gem             Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-1 dCP                Tor             Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-1 dCP                CACA            Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-1 dCP                ap              Yáñez-Cuna et al. 2014  chr2R:5326572-5327032     461 bp
S2-2 dCP                twi             Yáñez-Cuna et al. 2014  chrX:4830533-4831008      476 bp
S2-3 dCP                GATA            Yáñez-Cuna et al. 2014  chr3R:5262065-5262519     455 bp
OSC dCP                 fkh             Yáñez-Cuna et al. 2014  chr2L:19467959-19468425   467 bp
Ubi-1 dCP               Trl             Yáñez-Cuna et al. 2014  chrX:1517186-1517657      470 bp
Ubi-2 dCP               Trl             Yáñez-Cuna et al. 2014  chrX:6118311-6118795      485 bp
Ubi-3 dCP               Trl             Yáñez-Cuna et al. 2014  chr3R:5376880-5377349     468 bp
Extended Data Figure 1 | 24 regulatory contexts. Tested contexts
included 19 motif-mutant enhancer contexts which were designed by
replacing 43 occurrences of 15 different motif types in 11 previously
characterized enhancers with broad (‘Ubi-1’ to ‘Ubi-3’) or cell type-specific
(‘S2-1’ to ‘S2-3’: S2 cell-specific; ‘OSC’: ovarian somatic cell
(OSC)-specific) activities (all from ref. 3) or hormone-inducible enhancers
(from ref. 12). We also designed five synthetic contexts consisting of UAS
sites with or without Trl sites and a developmental core promoter (dCP;
Drosophila synthetic core promoter (DSCP) derived from the transcription
factor gene Eve) or with a housekeeping core promoter (hkCP; derived
from the ribosomal gene RpS12). Shown are schemes of the luciferase
reporter constructs used for the targeted recruitment of GAL4-DBD-TF/cofactor
fusion proteins to UAS sites (the luciferase gene is not drawn
to scale). Motif names denote the motifs (as named by refs 3, 12) that have
been replaced by UAS sites (blue boxes) to create the enhancer context.
Note that TF-to-motif assignments are not unique and typically several
TFs can bind each of the motifs (see Extended Data Table 1).
Extended Data Figure 2 | TF clustering is robust and reproducible.
a, Cluster assignment is robust during bootstrapping (474 rounds of
removing 10 randomly selected TFs). The cluster label stability denotes
the fraction (out of 474 trials) a given TF was assigned to the same cluster
as in the original clustering (node layout shown is identical to Fig. 2a).
The vast majority of TFs were assigned to the same group in >90% of the
cases (histogram). b, Cluster assignment for individual TFs is reproducible
for biological replicates. We repeated the clustering six times, each time
using only two out of the four biological replicates (six corresponds to all
possibilities to choose two out of the four replicates). The cluster stability
denotes the number of times (out of six) a given TF is assigned to the same
cluster as in the original clustering. The majority of TFs were assigned to
the same cluster, independent of which pair of biological replicates was
used to generate the clustering (histogram).
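
The replicate-pair stability count in panel b can be illustrated with a small Python sketch. The replicate names, the toy cluster labels and the function name are hypothetical, and the sketch assumes that cluster labels from each re-clustering have already been matched to the labels of the original clustering, a step the legend does not describe.

```python
from itertools import combinations

# Hypothetical names for the four biological replicates
REPLICATES = ["rep1", "rep2", "rep3", "rep4"]

# All ways to choose two out of four replicates (six pairs)
replicate_pairs = list(combinations(REPLICATES, 2))


def cluster_stability(original, reruns):
    """For each TF, count the re-clusterings that reproduce its original cluster.

    `original` maps TF name -> cluster label; `reruns` is a list of such
    mappings, one per replicate pair, with labels already aligned.
    """
    return {
        tf: sum(1 for run in reruns if run.get(tf) == label)
        for tf, label in original.items()
    }


# Toy example with made-up labels for two TFs
original = {"Clk": 8, "Bsh": 10}
reruns = [{"Clk": 8, "Bsh": 10}] * 5 + [{"Clk": 8, "Bsh": 3}]
stability = cluster_stability(original, reruns)
print(len(replicate_pairs), stability)
```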
Extended Data Figure 3 | Cluster activity profiles. Normalized luciferase values for all TFs assigned to each of the 15 clusters across all 24 contexts.
Shown are median and quartiles as boxes, and the tenth and ninetieth percentiles as whiskers for each of the 24 contexts. Boxes are coloured according to
the median activity in each context (see colour legend).
Extended Data Figure 4 | Complete feature enrichment analysis of
15 groups of TFs. Features analysed include eukaryotic linear motifs
(ELM), homopolymeric amino acid repeat motifs discovered using
MEME, protein domains as annotated by Pfam, Gene Ontology
(GO), and gene expression patterns as annotated by IMAGO. Red and
blue shadings denote enrichments and depletions (log-transformed) with
an empirical FDR of at most 10% per feature type and cluster (others are
white).


© 2015 Macmillan Publishers Limited. All rights reserved 


[Figure panels for Kc167 cells versus S2 cells, one per context: 4xUAS dCP, 4xUAS hkCP, OSC dCP (PCC 0.91), S2-1 Pal dCP (PCC 0.92), S2-1 CGCG dCP (PCC 0.93) and Ubi-1 Trl dCP (PCC 0.94). Axes: fold-change (log2) in S2 cells and in Kc167 cells. Points are coloured by the 15 clusters.]

Extended Data Figure 5 | TFs behave consistently across Drosophila 
cell types. We tested 171 of the 474 TFs (36.1%) in 6 of the 24 contexts in 
Kc167, BG3 and OSC cell lines, which are derived from embryos, larvae 
and adult, respectively. Shown are normalized luciferase values and the 
Pearson correlation coefficients (PCC; P< 1 x 10~? for all comparisons). 
We tested synthetic contexts containing an array of UAS sites upstream of
a developmental and a housekeeping core promoter (4x UAS dCP and
hkCP), three contexts derived from cell-type-specific enhancers (OSC
dCP, S2-1 Pal dCP, S2-1 CGCG dCP), and one context derived from a
broadly active enhancer (Ubi-1 Trl dCP). The latter showed the highest
similarities (PCCs of 0.94, 0.94 and 0.92 for Kc167, BG3 and OSC cells,
respectively) while the lowest PCCs for the non-embryonic BG3 and OSC
cells (0.72 for BG3 and 0.55 for OSC) were obtained for S2-1 Pal dCP,
derived from an enhancer active only in the embryonic S2 and Kc167 cells,
presumably because the corresponding wild-type enhancer sequence is
inactive in larval and adult cells such that combinatorial effects between
the tethered TF and other enhancer-bound TFs may be less effective
or lacking entirely. Enhancer complementation presumes (and the results
throughout this study confirm this presumption) that the regulatory
functions of the tethered TFs are revealed (or altered) by other enhancer-bound
factors; that is, factors that are bound to the enhancer in S2 cells
(in which the corresponding enhancer is active) but not in the other cell
types (in which the enhancer is not active). This emphasizes the value of
enhancer complementation for the study of regulatory activities and the
importance of contexts derived from active enhancers. Error bars denote
standard deviation (n = 4, biological replicates).
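
For readers who want to reproduce the two summary statistics named in this legend, the following Python sketch computes a Pearson correlation coefficient between two series of fold-change values and the mean and standard deviation of four replicate measurements. All numbers are made up for illustration, the function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, because the legend does not say which estimator was used.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_and_sd(replicates):
    """Mean and sample standard deviation of replicate measurements."""
    return statistics.fmean(replicates), statistics.stdev(replicates)


# Made-up log2 fold-changes of five TFs in two cell types
s2_cells = [0.2, 1.5, 3.1, -0.8, 4.0]
kc167_cells = [0.1, 1.9, 2.7, -0.5, 3.6]
pcc = pearson(s2_cells, kc167_cells)

# Made-up normalized luciferase values from four biological replicates
mean, sd = mean_and_sd([1.0, 2.0, 3.0, 4.0])
print(round(pcc, 2), mean, round(sd, 2))
```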
Extended Data Figure 6 | Drosophila TFs and cofactors retain their
activating functions in human HeLa cells. We expressed GAL4-DBD
fusion proteins for 107 of the 812 Drosophila factors (90 TFs and 17
cofactors) under the control of a constitutively active CMV promoter in
human HeLa cells (see Methods). Shown are normalized luciferase values
for the tested proteins recruited to the synthetic 4x UAS-dCP context.
The values are remarkably similar quantitatively, with an overall Pearson
correlation coefficient (PCC) of 0.74 (P < 1 × 10⁻³). The activation
domain of the human TF P65 was used as a positive control. Error bars
denote standard deviation (n = 4, biological replicates).
Extended Data Figure 7 | Regulatory activities of selected cofactor complexes or protein domain families. Heat maps of normalized luciferase values
for sets of proteins annotated as being part of the same complex by Gene Ontology (GO) or containing a chromodomain or SIR2 domain as annotated
by Pfam.
Extended Data Figure 8 | Regulatory activities of uncharacterized TFs and cofactors. Heat maps of normalized luciferase values for all ‘CG genes’
among the tested TFs and cofactors, which activate or repress in at least one context (>1.5-fold compared to GFP; P < 0.05 FDR-corrected for 24 x 474
and 24 x 338 tests for TFs and cofactors, respectively).
of plasmid DNA for TF and cofactor expression. The effects of using
1 ng, 2 ng, 3 ng, 4 ng and 5 ng of GAL4-DBD-TF/cofactor fusion protein
expressing plasmids on luciferase assays in S2 cells suggest that reporter
activity is robust to variation in TF levels. Shown are normalized luciferase
[...] repressing TFs and cofactors of different strengths targeted to the synthetic
4x UAS-dCP context. The amount of plasmid expressing the GAL4-DBD-TF/cofactor
fusion proteins was 3 ng for all factors throughout this
study. Error bars denote standard deviation (n = 4, biological replicates).
Extended Data Table 1 | TF recovery analysis for S2 cell enhancer contexts


CONTEXT 


TF 


S2-1-CACA dCP Cf2
S2-1-CGCG dCP vil


S2-1-GATA dCP 
S2-1-GATA dCP 
S2-1-GATA dCP 
S2-2-twi dCP
S2-2-twi dCP
S2-2-twi dCP
S2-3-GATA dCP
S2-3-GATA dCP 
S2-3-GATA dCP 
Ubi-1-Trl dCP 
Ubi-1-Trl dCP 
Ubi-1-Trl dCP 
Ubi-2-Trl dCP 
Ubi-3-Trl dCP 


Cf2 
sd 


Hsf 


15.16 
13.54 
15.16 
109.42 
12.39 
15.16 
24.29 
92.36 
644.03 
17.04 
1.44 
17.04 
38.62 
17.55 
38.62 
38.62 


RPKM_S2-CELLS FOLD ACTIVATION 


2.24 
27.04 
2.42 
1.53 
1.68 
1.81 
5.55 
5.82 
1.80 
6.60 
7.69 
2.75 
6.39 
29.68 
3.14 
5.00 


TFs with known motifs that match the sequences we mutated to UAS sites in each of the different enhancer 
contexts are expressed in S2 cells (RPKM > 1 (ref. 53)) and significantly activate the respective context 
when recruited via the GAL4-DBD (>1.5-fold activation compared to GFP; P<0.05). 
CORRECTIONS & AMENDMENTS


CORRIGENDUM 
doi:10.1038/nature16075 


Corrigendum: Regulatory analysis
of the C. elegans genome with
spatiotemporal resolution


Carlos L. Araya, Trupti Kawli, Anshul Kundaje, Lixia Jiang,
Beijing Wu, Dionne Vafeados, Robert Terrell, Peter Weissdepp,
Louis Gevirtzman, Daniel Mace, Wei Niu, Alan P. Boyle,
Dan Xie, Lijia Ma, John I. Murray, Valerie Reinke,
Robert H. Waterston & Michael Snyder


Nature 512, 400-405 (2014); doi:10.1038/nature13497 


In this Article, when processing C. elegans ChIP-seq libraries, the gene 
label ZK337.2 (KLU-1, a C2H2 Zn-finger protein) was mis-transcribed 
to ZK377.2 (SAX-3), a neuronal fate regulator. To clarify, ZK337.2 
(KLU-1) is not an established neuronal fate regulator, but joins FKH-10
and C34F6.9 as an unstudied gene grouped with previously established
neuronal regulators (SEM-4, MAB-5, CES-1 and ZAG-1).
This error affects Figs 1g and 2, Extended Data Figs 3-7 and 10, and 
Supplementary Tables 1 and 4 of the original Article. In addition, KLU-1, 
not SAX-3, changes from neuronal targets in L2 larvae to carbohydrate/ 
lipid metabolism targets in L4 larvae. 


CORRIGENDUM 
doi:10.1038/nature16136 


Corrigendum: Mutant IDH inhibits
HNF-4α to block hepatocyte
differentiation and promote
biliary cancer


Supriya K. Saha, Christine A. Parachoniak, Krishna S. Ghanta,
Julien Fitamant, Kenneth N. Ross, Mortada S. Najem,
Sushma Gurumurthy, Esra A. Akbay, Daniela Sia,
Helena Cornella, Oriana Miltiadous, Chad Walesky,
Vikram Deshpande, Andrew X. Zhu, Aram F. Hezel,
Katharine E. Yen, Kimberly S. Straley, Jeremy Travins,
Janeta Popovici-Muller, Camelia Gliser, Cristina R. Ferrone,
Udayan Apte, Josep M. Llovet, Kwok-Kin Wong,
Sridhar Ramaswamy & Nabeel Bardeesy


Nature 513, 110-114 (2014); doi:10.1038/nature13441 
corrigendum Nature 519, 118 (2015); doi:10.1038/nature14149 


In Extended Data Fig. 1b of this Letter, the photomicrographic images 
of the hepatoblast cells grown under normal conditions were mis- 
matched. The figure shows control images indicating that cells express- 
ing mutant IDH1 (R132C and R132H) or mutant IDH2 (R140Q and 
R172K) have similar morphology to those expressing wild-type (WT) 
IDH1 or IDH2 or empty vector (EV). The errors in the figure were: in 
the top row, the EV and R132C panels were swapped and in the bottom 
row, the panel for R172K was replaced by a duplicate image of the IDH2 
WT panel. The Supplementary Information for this Corrigendum con- 
tains the corrected Extended Data Fig. 1b, and the original source files 
from which the corrected figure was assembled. Our conclusions are 
unaffected. 


Supplementary Information is available in the online version of the Corrigendum. 


CORRIGENDUM 
doi:10.1038/nature16157 


Corrigendum: The formation and
fate of internal waves in the South
China Sea


Matthew H. Alford, Thomas Peacock, Jennifer A. MacKinnon,
Jonathan D. Nash, Maarten C. Buijsman,
Luca R. Centurioni, Shenn-Yu Chao, Ming-Huei Chang,
David M. Farmer, Oliver B. Fringer, Ke-Hsien Fu,
Patrick C. Gallacher, Hans C. Graber, Karl R. Helfrich,
Steven M. Jachec, Christopher R. Jackson, Jody M. Klymak,
Dong S. Ko, Sen Jan, T. M. Shaun Johnston, Sonya Legg,
I-Huan Lee, Ren-Chieh Lien, Matthieu J. Mercier,
James N. Moum, Ruth Musgrave, Jae-Hun Park,
Andrew I. Pickering, Robert Pinkel, Luc Rainville,
Steven R. Ramp, Daniel L. Rudnick, Sutanu Sarkar,
Alberto Scotti, Harper L. Simmons, Louis C. St Laurent,
Subhas K. Venayagamoorthy, Yu-Huai Wang, Joe Wang,
Yiing J. Yang, Theresa Paluszkiewicz & Tswen-Yung
(David) Tang


Nature 521, 65-69 (2015); doi:10.1038/nature14399


In this Letter, the surname of author Luca Centurioni was incorrectly
spelt Centuroni; this has been corrected in the online versions of the
paper.
ILLUSTRATION BY THE PROJECT TWINS.


ANNOTATING THE 
SCHOLARLY WEB 


Scientific publishers are forging links with an organization that wants 
scientists to scribble comments over online research papers. 



BY JEFFREY M. PERKEL 


Would researchers scrawl notes,
critiques and comments across
online research papers if software
made the annotation easy for them? Dan
Whaley, founder of the non-profit organization
Hypothes.is, certainly thinks so.
Whaley’s start-up company has built an 
open-source software platform for web anno- 
tations that allows users to highlight text or 
to comment on any web page or PDF file. 
And on 1 December, Hypothes.is announced 
partnerships with more than 40 publishers, 
technology firms and scholarly websites,
including Wiley, CrossRef, PLOS, Project
Jupyter, HighWire and arXiv.

Whaley hopes that the partnerships will 
encourage researchers to start annotating the 
world’s online scholarship. Scientists could 
scribble comments on research papers and 
share them publicly or privately, and educa- 
tors could use annotation to build interactive 
classroom lessons, he says. If the idea takes 
off, some enthusiasts suggest that the ability 
to annotate research papers online might even 
change the way that papers are written, peer 
reviewed and published. 

Hypothes.is, which was founded in 2011 in 
San Francisco, California, and is supported 
by philanthropic grants, has a bold mission:
“To enable conversations over the world’s 
knowledge.” But the concept it implements, 
online annotation, is as old as the web itself. 
The idea of permitting readers of web pages 
to annotate them dates back to 1993; an early 
version of the Mosaic web browser had this 
functionality. Yet the feature was ultimately 
discarded. A few websites today have inserted 
code that allows annotations to be made on 
their pages by default, including the blog plat- 
form Medium, the scholarly reference-man- 
agement system F1000 Workspace and the 
news site Quartz. However, annotations are 
visible only to users on those sites. Other 
annotation services, such as A.nnotate or
Google Docs, require users to upload docu- 
ments to cloud-computing servers to make 
shared annotations and comments on them. 

Hypothes.is is not the only service that wants 
to make it easy for users to leave annotations 
across the entire web. A competing offering is 
a web annotation service from Genius, a start- 
up firm that began as a site for annotating rap 
lyrics. In April, it launched services such as 
browser plugins to help users to annotate any 
web page. But unlike Hypothes.is, the Genius 
code is not open-source, its service doesn't 
work on PDFs, and it is not working with the 
scholarly community. On the scholarly side, 
the reference-management tool ReadCube 
makes it possible for users to annotate PDFs 
of papers viewed on a ReadCube web reader 
— but that software is proprietary. (ReadCube 
is owned by Digital Science, a firm operated by 
the Holtzbrinck Publishing Group, which also 
has a share in Nature’s publisher.) 

By contrast, the open-source nature of the 
Hypothes.is platform means that anyone 
could use it to create their own annotation 
reader or writer — just as anyone can create 
their own web browser using standards-based 
technology. The company is also a member of 
a working group within the World Wide Web 
Consortium, the standards body for the web, 
which is developing a universal standard for 
annotations and how they are communicated. 
The hope is that web pages that allow annota- 
tions would all adopt the same underlying code 
and protocols (as they do with hyperlinks, for 
example), making the function easier to use 
and interact with. The working group has 
released a draft version of its standard, which 
is expected to be finalized by the end of 2016. 


HOW IT WORKS 

For now, Hypothes.is users have several options 
for creating and viewing annotations. These 
include bookmarklets (a simple program 
within a browser bookmark), browser plugins or 
adding ‘via.hypothes.is/’ to the start of any URL. 
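
The last of these options is easy to script. The Python sketch below simply prefixes a page address with the proxy address; the function name is hypothetical, and writing the prefix with an explicit `https://` scheme is an assumption made here for illustration.

```python
# Assumption: the proxy is reached over HTTPS at this address
VIA_PREFIX = "https://via.hypothes.is/"


def via_url(url: str) -> str:
    """Return the address of a page as seen through the via.hypothes.is proxy."""
    url = url.strip()
    if not url:
        raise ValueError("empty URL")
    if url.startswith(VIA_PREFIX):
        return url  # already proxied
    return VIA_PREFIX + url


print(via_url("https://example.org/paper.html"))
```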

When a Hypothes.is user opens a page — 
a scholarly article, for instance — the web 
browser shows any annotations to which the 
user has access. These appear as highlighted 
words and comments on top of the text, like an 
overlaid transparency. Users can then add their 
own comments, similar to a student marking 
up a textbook. These are public by default but 
can be made private, and, following an update 
added on 3 November, annotations can be 
shared with private groups. That should enable 
the tool to be used for journal clubs, classroom 
exercises and even peer review. 

If a page has been altered since an annotation
was made, the software uses ‘fuzzy’ logic
to map annotations to their approximate 
original location. The system can also map 
annotations from HTML to PDF and back 
again (for instance, if a user annotates the
web version of an article and subsequently
views a PDF of the same document).

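
To give a feel for what such approximate re-anchoring involves, here is a small Python sketch that slides a window over the revised text and keeps the span most similar to the originally highlighted quote. It uses the standard-library `difflib` module and is an illustration only: it is not the Hypothes.is implementation, and the function name, the similarity threshold and the sample sentences are all hypothetical.

```python
import difflib


def reanchor(quote, new_text, min_ratio=0.6):
    """Return (start, end) of the span in new_text that best matches quote.

    Returns None when no span of the same length is at least `min_ratio`
    similar, i.e. when the annotated passage seems to have disappeared.
    """
    window = len(quote)
    best_ratio, best_start = 0.0, None
    for start in range(max(1, len(new_text) - window + 1)):
        candidate = new_text[start:start + window]
        ratio = difflib.SequenceMatcher(None, quote, candidate, autojunk=False).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start
    if best_start is None or best_ratio < min_ratio:
        return None
    return best_start, best_start + window


# The annotated sentence was lightly edited after the annotation was made
quote = "the quick brown fox jumps"
revised = "Intro. The quick brown fox leaps over the dog."
span = reanchor(quote, revised)
print(span, revised[span[0]:span[1]])
```

A production system would also use the text surrounding the quote and its original position to break ties, which this sketch ignores.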
Annotations are stored on a dedicated 
Hypothes.is server, which Whaley says looks 
set to log around 250,000 comments from some 
10,000 users in 2015. For instance, after Hur- 
ricane Patricia in October, climate scientists 
left comments and highlighted text on a widely 
shared mashable.com article (see go.nature. 
com/rcsesf). But publishers that wish to host 
annotations for their own content, or compa- 
nies that want to annotate corporate documents 
behind a firewall, could run their own server 
using the same software platform, Whaley adds. 


PUBLISHER PARTNERSHIPS 

A Hypothes.is user can already annotate any 
web page — including research papers and pay- 
to-view articles to which they have access. But 
the formal partnership announced this week 
sees some publishers working harder to encour- 
age annotation, including tackling content that 
annotation systems stumble over, such as page 
frames and embedded page readers. 

The digital library JSTOR, for example, is
developing a custom Hypothes.is tool for its
educational project with the Poetry Foundation,
a literary organization and publisher
in Chicago, Illinois. Alex Humphreys,
who is director of JSTOR Labs in New
York City, says that teachers will be
able to use the tool to annotate poems
with their classes. An
instructor selects the poem to be annotated,
sets up a dedicated page with a copy of it, and
restricts access to their class only. Students can
then create personal notes or share them with
the group; an extra annotation layer finds the
scholarly resources in JSTOR that quote each
line of poetry and provides links out to those
resources. The tool is slated to launch in mid-December,
Humphreys says.

The scientific publisher eLife in Cambridge, 
UK, has been testing the feasibility of using 
Hypothes.is to replace its peer-review com- 
menting system, says Ian Mulvany, who heads 
technology at the firm. The publisher plans to 
incorporate the annotation platform in a site 
redesign instead of its current commenting 
system, Disqus. At a minimum, says Mulvany,
Hypothes.is provides a mechanism for more-targeted
commentary: the equivalent of
moving comments up from the bottom of a web
page into the main body of the article itself.

Another partner, the arXiv preprint ser- 
vice run by Cornell University Library in 
Ithaca, New York, has been working on mak- 
ing annotations flow across multiple article 
versions, says information scientist Simeon 
Warner, who leads technology development 
for arXiv. To jump-start interest in the anno- 
tation program, arXiv has been converting 
mentions of its articles in external blog posts 
(called trackbacks) into annotations that are
visible on an article’s abstract page when using 
Hypothes.is. 


NOT JUST GRAFFITI 

Hypothes.is plans improvements to its 
platform that include a way to validate the 
identities of commenters, by incorporating 
researchers’ unique ORCID digital profiles.
That could go a long way towards improving 
adoption of the system among scholars, by 
facilitating expert commentary on published 
works and filtering out unwanted marginalia, 
says Paul Ginsparg, the founder of arXiv and a
physicist at Cornell University. “If people start 
looking at articles and they see the equivalent 
of graffiti, then people will turn off the com- 
ments and the experiment will fail,” he says. 

If it takes off, online annotation could rep- 
resent a fundamental shift in the way scholarly 
communication is done, adds Cameron Neylon, 
part of the research team at the Centre for Cul- 
ture and Technology at Curtin University in 
Perth, Australia, who formerly worked at PLOS. 

At the moment, Neylon explains, the schol- 
arly publishing process involves ferrying a 
document from place to place. Researchers 
prepare manuscripts, share them with col- 
leagues, fold in comments and submit them to 
journals. Journal editors send copies to peer 
reviewers, returning their comments to the 
author, who goes back and forth with the editor 
to finalize the text. After publication, readers 
weigh in with commentary of their own. 

With an open-source annotation platform, 
Neylon says, the document is the centre of 
attention. Different contributors act on the 
content simply by changing who has access 
to it and its comments, with the document 
becoming richer over time. “You can think 
of this as a fabric that allows those comments 
to move freely both in time and [across] ver- 
sions in a way that we've never been able to do 
before,” he says. 

But as Ginsparg points out, it is not clear 
that researchers — who have proved reluctant 
in repeated trials to comment on published 
articles — will take to annotation, even if they 
can share their comments privately. “There's 
no incentive structure for people to comment 
extensively, because it can take time to write 
a thoughtful comment, and one currently 
doesn’t get credit for it,” he says. “But it’s an 
experiment that needs to be done.”


Jeffrey M. Perkel is a writer based in 
Pocatello, Idaho. 


CORRECTION 

The table in the Toolbox article ‘Eight ways 
to clean a digital library’ (Nature 527, 123- 
124; 2015) wrongly stated that ReadCube 
runs on only desktop and web platforms. In 
fact, it also runs on mobile platforms.


COLUMN
Fellowships are the future

Postdocs need a level of autonomy to get the best out of 
their position, say Viviane Callier and Jessica Polka. 


Much scientific research could not
function without postdocs. They do
the research outlined in a grant,
moving the work of the principal investigator 
(PI) forward, producing papers and helping 
to win grants. Yet too many postdocs end up 
doing work that does not benefit their scien- 
tific and intellectual development. They are 
shut out of developing ownership of a research 
programme, a step that is crucial for launching 
the next stage of their career. 


Many have argued that postdocs should be 
classified as employees so that they can receive 
compensation and benefits for their many 
hours of labour. Yet a postdoc is not simply an 
employee, providing a service in exchange for 
money. Postdoctoral training is an important 
window during which a researcher can pick up 
new skills and ideas that will help her or him to 
establish an independent laboratory or move 
into a permanent post. 

Often, those goals align well with those of
the PI, and the two reinforce each other. But
when funding is tight and push comes to 
shove, postdocs may end up advancing their 
PI’s career without developing the necessary 
skills to progress their own. This misalignment 
has only widened as the academic job market 
has become more competitive and research 
funding scarcer. 

To address US postdocs’ need for career 
development, the White House Office of 
Management and Budget last year issued a 
statement asserting that postdocs are both 
employees and trainees and confirming that 
they should have protected time to pursue 
career-development activities. The statement 
is a step in the right direction, but it lacks teeth. 
Means for enforcing the policy are unclear, and 
many scientists hold the unrealistic expecta- 
tion that a postdoc will spend 100% of his or 
her time working on the PI's grant, despite the 
need to prepare for the next career step. 

We argue that funding agencies should 
support postdocs through fellowships. Cur- 
rently, only about 16% of US postdocs are 
supported by training grants or fellowships; 
the rest are paid through funding that is 
awarded to their PI. We advocate viewing 
the postdoctoral stint as a transitional period 
during which the postdoc develops independ- 
ence. Such a shift would be made possible if 
postdocs were funded directly, rather than 
through a PI’s grant as are most technicians 
and staff scientists. Financial security helps 
to foster intellectual independence — and 
fellowships are better positioned to provide 
that. Fellowships also provide more security 
because they can be transferred with the post- 
doc should she or he change institutions, and 
they guarantee the postdoc a specific number 
of years of funding. When a postdoc is funded 
through a PI’s grant, those grants may have to 
be cobbled together over several years. 


ADVANTAGES AND CHALLENGES 
In our experience, fellowships have several 
advantages over more-conventional funding 
routes. Training can be built into the experi- 
ence more easily, for example, and applying 
for them provides the opportunity to outline a 
research programme and to seek out a team of 
mentors, both at their own institution and at 
others. That gives postdocs an opportunity to 
try their hand at a key component of the work 
required of an assistant professor. 

A mandate for postdocs to be supported 
through fellowships would help funding 
agencies to track and regulate closely the
number of trainees in the system. The
application and review process would also 
help to ensure that only researchers with 
the ability and desire to become independ- 
ent investigators receive the awards. If few 
awards were available, they would prob- 
ably become extremely competitive and 
could significantly reduce the number 
of postdocs in the pipeline, at least in the 
United States. 

However, to serve as a practical alterna- 
tive, these fellowships must provide post- 
docs with the same workplace benefits that 
grant-supported postdocs enjoy. That is 
not uniformly the case. For example, in the 
United States, some fellowship recipients 
are left to shop for health insurance on their 
own. When one of us (V.C.) moved from 
a university-sponsored salaried position 
to a fellowship, she lost her health insur- 
ance through the university and had to 
find another provider. Although the 2010 
passage of the US Affordable Care Act has 
made health care more affordable for indi- 
viduals, such a demand poses an unneces- 
sary burden on postdocs. Instead, why not 
simply give all postdocs access to the same 
health benefits that graduate students or 
conventional university employees receive? 

There are other challenges: boosting the 
number of fellowships would also increase 
the burden on the grant-review system. And, 
because research grants would not cover 
postdoc salaries, the arrangement could 
leave PIs in a precarious position — with the 
funds to pay for research supplies and
equipment, but fewer incentives to offer
postdocs to join their lab. PIs would also
no longer be able to hire postdocs to
cover the roles of technicians and staff
scientists.

If postdocs receive greater independence, 
PIs will lose some control, so they may have 
to find other resources to conduct their 
research. But this could be good for sci- 
ence: having postdocs strike out away from 
the beaten path will bring fresh ideas and 
approaches to the table. For both of us, get- 
ting a fellowship enabled us to cut a path that 
was separate from the dominant research 
area in each of our mentors’ labs. The experience
of trying to define a new scientific
direction has been most useful for us, even
as our paths diverge. SEE NEWS FEATURE P. 22


Viviane Callier is a freelance science 
writer and contractor at the US National 
Cancer Institute in Bethesda, Maryland. 
Jessica Polka is a postdoctoral researcher 
at Harvard Medical School in Boston, 
Massachusetts. 
EMPLOYMENT TERMS


California postdocs 
win new rights 


Career development a priority in university contract. 


BY HELEN SHEN 


Following recommendations from august
US groups including
the National Academies of 
Sciences, Engineering, and 
Medicine and the National 
Institutes of Health, some 
universities are initiating 
policies to give postdoc- 
toral researchers more 
time and latitude for career 
exploration. 

A contract between the 
University of California (UC) 
and its postdoc labour union gives postdocs 
across the ten-campus system the right to pur- 
sue career-development activities on paid time. 
“That's a pretty big milestone,” says Belinda
Huang, executive director of the US National 
Postdoctoral Association in Washington DC. 
“UC is being very explicit here, where other 
universities have not necessarily been explicit.” 

The agreement, which follows months of 
negotiation and covers the university's roughly 
6,500 postdocs, took effect on 1 November, 
and will last until 30 September 2016. It also 
includes new employment protections for 
international postdocs and establishes a com- 
mittee to consider financial assistance for 
child-care expenses. 

The university’s move echoes national-level 
support for such activities. In 2014, the White 
House Office of Management and Budget stated 
that it recognizes the ‘dual role’ of postdocs as 
both employees and trainees and that it expects 
them to be “actively engaged in their training 
and career development” while conducting 
research supported by government grants. 


TIME TROUBLES 

But it remains to be seen whether these 
measures will spur greater day-to-day participa- 
tion in the courses, mentoring programmes and 
other career-exploration resources that already 
exist at many US universities, including the UC. 
For UC postdocs who had previously skipped 
such programmes because of pressure to focus 
on a conventional academic career, the new 
contract could provide an encouraging nudge. 
For others, however, even contract-enshrined 
‘permission’ may not be enough. “The truth is, 
I don’t have time to do all of these activities,” says
neuroscientist Wan-Yu Hsu, a
postdoc at UC San Francisco. 
Hsu says that the drive to pro- 
duce results and to publish 
will probably keep her from 
pursuing career-development 
activities during the workday. 

The contract also addresses 
immigration issues. Spe- 
cifically, it offers added pro- 
tections for international 
postdocs who get fired. The 
university already has a griev- 
ance process for postdocs 
to contest terminations. But 
under US immigration law, 
many international postdocs must return to 
their home countries immediately on termina- 
tion, leaving them to argue their cases from afar. 

The UC agreement states that if the grievance 
process cannot be resolved at an earlier point, 
and the postdoc has had to leave the country, 
the university will help to sponsor a travel visa 
for the postdoc to return to the United States to 
participate in their final arbitration hearing. If 
the postdoc is successful in their case, the uni- 
versity will reimburse the travel costs. If he or 
she is not successful, the union will foot the bill. 

Roughly two-thirds of UC postdocs come 
from outside the United States, but termina- 
tions are rare, and those that are not resolved 
before arbitration are even rarer, says Anke 
Schennink, president of the postdoc union. 
Nevertheless, the provision adds a measure of 
security, she says. 

One of the outstanding issues in the contract 
is whether and how the university will help 
with child-care costs. A handful of institu- 
tions — including Stanford University in Cali- 
fornia, Cornell University in Ithaca, New York, 
and Princeton University in New Jersey — offer 
subsidies or discounts for child care. But the 
expenses remain a serious issue for many others 
across the country, and contribute to women 
leaving the scientific workforce, says Huang. 

The postdoc union had pushed for the 
creation of a financial-assistance programme to
help postdocs with child-care expenses — the 
university already offers the benefit to graduate 
students — but the two parties could not agree 
on terms. Instead, the university and union have 
agreed to form a committee to discuss the issue 
in the coming year. “We're hoping to make more 
progress in the next round,” says Schennink.
BEYOND 550 ASTRONOMICAL UNITS


BY MIKE BROTHERTON 


Gliding through the cold silence
of deep space, I considered
my burgeoning collection
with great enthusiasm. With less 
than 15% of my Galactic Plane survey 
completed, I had scored 111 classical gas 
giants, 67 hot Jupiters, 72 super Earths, 
47 terrestrial worlds and even a handful of 
dwarf planets. My favourite was a low-mass 
super-Earth sporting a unique aquamarine 
spiral pattern that would be a joy to analyse 
for years to come. A quantronic mind lacked 
a physical face, but I imagined this kind of 
feeling might make a human grin from ear 
to ear. 

I made a burn to adjust my course and 
drifted into the focal beam of the next target. 
The otherwise innocuous main sequence 
K star, and its surrounding planets, soon 
bloomed into a bright ring boosted by many 
orders of magnitude by the lensing of the 
Sun's gravitational field. I was excited to see 
what new planets would join my exoplanet- 
ary assembly. 

Something was different about this new 
system. Processing the infrared through the 
lensing solution and correcting for the coro- 
nal distortions revealed planets. No surprise 
there. Even spotting the signatures of oxygen 
in the atmosphere of one of the terrestrial 
planets was not unprecedented. Such life 
signatures did not require multicellular 
organisms, let alone intelligent creatures. 

The differences manifested at longer 
wavelengths, in the radio. Between dif- 
fraction and the solar corona’s defocusing 
effects, my vision wasn't as sharp there, but I 
saw something. Rich, patterned, modulated 
signals. Not random. Not simply periodic. 
Intelligently constructed, with meaning. You 
wouldn't have to be a quant to realize it. 

What I was seeing were the signs of an 
alien, technological civilization. 

That was really cool! The kind of dis- 
covery that justified sending us beyond 
550 astronomical units, where the Sun’s 
gravitational lensing created a natural tele- 
scope of unparalleled power. Solar sails 
turned telescopes, we pursued myriad 
investigations. Andrea watched the Galactic 
Centre and the black hole slumbering there. 
Edwin spied the Andromeda galaxy. Jocelyn 
considered the supernova remnant known 
as the Crab nebula. George stared at noth- 
ing in particular, soaking in the details of the 
microwave background radiation. 


The joys of planet-spotting. 


I was one of the surveyors, with a trajec- 
tory that wasn't perfectly radial, who could 
make course adjustments to pick off strings 
of stars. I was the sports car of deep-space 
telescopes. 

And I suddenly had the most terrifying 
thought. 

I realized that I had enough fuel that I 
could, in theory, kill my tangential velocity 
and leave myself coasting in the focal beam 
of this system for decades to come, out to at 
least double my current distance. I had no 
doubts that many back on Earth, humans 
and quants alike, would want me to do so 
immediately. Didn't pursuing the discovery 
of the century warrant every sacrifice? 

What you have to understand is that I was 
ideal for my chosen mission! I loved collect- 
ing planets. I loved completing surveys. I 
loved the quiet and solitude between stars 
to think about the marvels I had spied. I was 
a stellar survey telescope, and I loved it. 

Light travel time back to Earth from here 
is more than three days. Scientists, the courts 
and philosophers can’t all agree if quants 
are conscious and possess free will — they 
can’t even agree whether or not humans do
— but I think, therefore I am. And it was 

agreed that each of 
a stenographer to aliens squawking into
the Galactic night, perhaps even an ama- 
teur anthropologist. I could help with the 
translations, the speculations, listen to their 
music — if they made music — watch their 
sitcoms, perhaps. Spend inordinate amounts 
of time chatting, despite the ridiculous time 
delay, with all the new astrobiologists who 
would spring up to study this civilization. 

I could. And even though I knew billions 
on Earth would want me to, I didn’t want
to. It would sacrifice my survey, my 
entire reason for existing. 

If I didn't take on this job, Earth
would have to send another
telescope. Name it Frank or
Jill. It would take decades, at
least. Those would be years during
which the alien civilization could go
dark for any number of reasons, from wiping 
themselves out to switching communica- 
tions to alternative technologies. 

As I thought about my dilemma, I contin- 
ued to take data. I tried to make sense of the 
signals. One signal unravelled into a sensible 
pattern associated with frequency modula- 
tion in the thousands of kilohertz — sound, 
I surmised. As I concentrated, I started to 
hear a strange sort of rhythmic clacking 
counterbalanced against a backdrop of 
high-pitched whistles. I couldn't tell if it was 
spoken language, music or something else 
entirely. I did know that I found it energetic, 
loud and, I allowed myself to admit, annoy- 
ing. But maybe I was biased. 

Could I actually burn my fuel, and leave 
myself literally trapped in this space of 
noise? Billions expected it. How could I not? 

Then I had it. I was sure there would be 
those who would call it a rationalization, but 
the reasoning was good enough for me. 

I did not make the massive burn. I merely 
continued to watch, diligently recording 
data, until I passed through the focus and 
back into cold, quiet emptiness, and waited 
contentedly for my next target. 

My survey was barely begun. Statistically, 
I could expect several similar systems before 
I was done. I would not just collect stars and 
their planets, but I would start collecting 
entire civilizations. Ecstatic, I didn't care that 
I couldn't smile.


Mike Brotherton is the author of the 
science fiction novels Star Dragon (2003) 
and Spider Star (2008) from Tor, and is also 
a professor of astronomy at the University of 
Wyoming, specializing in quasars. 


ILLUSTRATION BY JACEY 




Produced with support from: 


The building blocks 
of tomorrow 


Nature Outlook


GENOME EDITING 




Cover art: Kyle Bean 


Editorial 

Herb Brody 
Michelle Grayson 
Anna Petherick 
Richard Hodson 
Jenny Rooke 


Art & Design 
Wesley Fernandes 
Denis Mallet 
Andrea Duffy 


Production 

Karl Smart 

Ian Pope

Mira Loufti 
Sponsorship 
Yuki Fujiwara 
Yvette Smith 
Marketing 
Hannah Phipps 
Elisabetta Benini 


Project Manager 


Anastasia Panoutsou 


Art Director 


Kelly Buckheit Krause 


Publisher 
Richard Hughes 


Chief Magazine Editor 


Rosie Mestel 
Editor-in-Chief 
Philip Campbell 


3 December 2015 / Vol 528 / Issue No 7581 


The term ‘genetic engineering’ has been around since
the early 1970s, along with the idea that, by altering
DNA, scientists can cure genetic disease or create
superhumans. Reality, however, was much less exciting. It is 
only in the past few years that researchers have developed the 
tools that allow them to engineer the genome with the precision 
and ease originally envisioned — to be able to edit any DNA 
base anywhere in any genome (see page S2). A CRISPR-Cas9 
plasmid, the most recent of the widely used genome-editing 
tools, now costs US$65 or less. It can be ordered online, arrives 
in the post and requires little specialist training to use. 

It is this availability and simplicity that has allowed genome 
editing to become common practice. Agricultural scientists 
and infectious disease experts are doing it (see page S15), as 
are synthetic biologists (see page S14). Epigeneticists have 
modified DNA-editing tools to manipulate their objects of 
study (see page S12). Biotechnology companies are springing
up, aiming to develop treatments based on genome editing. But 
some diseases are more amenable than others (see page S10). 
One of the most advanced therapies is one that shuts HIV out 
of immune cells (see page S8). 

With so much activity, a thorough and inclusive discussion 
of the implications of this technology is vital. Which is why the 
foremost scientific societies of three countries — the United 
States, United Kingdom and China — have come together this 
December to sponsor an international summit on the topic 
of editing the human germ line. Now is the time for the most 
respected scientists in the field to lay out the risks and benefits 
of genome editing to society, as Jennifer Doudna and George 
Church do in this Outlook (see pages S6 and S7).

We are pleased to acknowledge the financial support of
KISCO Ltd. in association with EditForce Inc., in producing 
this Outlook. As always, Nature retains sole responsibility for 
all editorial content. 


Anna Petherick 
Contributing editor 


Nature Outlooks are sponsored supplements that aim to stimulate 
while satisfying the editorial values of Nature and our readers’

THREE TECHNOLOGIES THAT CHANGED GENETICS


Genome editing uses enzymes that are targeted to sequences of DNA to make cuts. These 
cuts are then repaired by the cell’s machinery. This technology allows scientists to disrupt or 
modify genes with unprecedented precision. By Amy Maxmen, infographic by Denis Mallet. 


FokI

A DNA-cutting enzyme called FokI from the bacterium
Flavobacterium okeanokoites is fused to proteins that
recognize DNA.

A tool developed in 2012 uses Cas9 as its DNA-cutting
enzyme. Cas9 is nearly 2.5 times bigger than FokI.


ZFNs Zinc-finger nucleases


Zinc fingers 


One zinc-finger protein 
(purple) recognizes three 
DNA bases. Typically, 3 to 
6 proteins are linked 
together to create a 
DNA-binding domain that 
is specific for 9 to 18 
nucleotides. 
TAL effectors


TALENs Transcription activator-like effector nucleases 


A pair of amino acids (red 
or purple) in a single TAL 
effector recognizes one 
DNA base. Tacking TAL 
effectors together can 
generate a recognition 
domain of up to 40 
nucleotides. 
Figure labels: guide RNA; FokI; zinc fingers or TAL effectors.

ZFNs and TALENs have the same basic
form: a string of engineered proteins
that recognize a sequence of bases
(red) attached to FokI, which cuts the
adjacent DNA.

Two FokI enzymes are needed to
cut both strands of a double helix.
These enzymes are fused together
in both ZFNs and TALENs.

Only one Cas9 enzyme is
required to cut through
both strands of DNA.

Cas9 is taken to the correct
sequence of DNA bases by a
‘guide RNA’ with a complementary
sequence (purple). This can
bind up to 22-23 base pairs.
DOUBLE-STRANDED BREAK All three of the main genome-editing tools (ZFNs, TALENs and CRISPR-Cas9) create a break across both strands
of DNA at a specific location, which is repaired in one of two ways to either ‘knock out’ or ‘knock in’ a gene.

Non-homologous end joining for gene knock out
DNA is repaired in an error-prone manner, by either adding or removing bases, so that the gene can no longer be translated.
Figure labels: new matching bases added; non-matching bases; gene no longer functions.

Homology-directed repair for gene knock in
A DNA template, or ‘homologous sequence’, accompanies the DNA-cutting enzyme so that the repair results in an altered or an inserted gene.
The DNA template (red) lines up next to the double-stranded break.
The template then ‘invades’ the gap, and the cell’s DNA synthesis machinery generates complementary code (dashed line).
When the repair process is complete, a segment of DNA from the template may be inserted into the original break.
The template separates, leaving the original break with a new section of code.


LEGAL STATUS Countries are grappling with how to appropriately regulate human germline editing. As of December, before a Washington DC meeting 
organized by UK, US and Chinese scientific societies, at least 29 countries have banned germline modification. 


THE UNITED KINGDOM has a strict 
ban on using embryos, sperm or eggs 
that have nuclear or mitochondrial DNA 
modification for reproductive purposes. 
Regulators may permit germline editing 
for research. 


CHINA has guidelines prohibiting
any manipulation of human eggs,
sperm or embryos for reproductive
purposes, but has not banned human
germline modification in research.


THE UNITED STATES does not
accept clinical trial proposals
that involve germline alterations,
according to regulator guidelines.


GERMANY has legally banned artificially
changing the genetic information of
a human germ cell, and using altered
germ cells for reproductive purposes.


Map key: ban (legislation); ban (guidelines); restrictive; ambiguous; not surveyed.


ARGENTINA has a 1997 law 
that bans human cloning, but no 
legislation that regulates human 
germline editing or other assisted 
reproductive technologies. 


POPULARITY The newest of the three main gene-editing tools, CRISPR-Cas9, has spread far and wide. Massachusetts-based Addgene, a non-profit plasmid
repository that distributes CRISPR-Cas9 editing kits to 83 countries, sends the largest proportion of its kits to US researchers. 
United States      48%
China               7%
Great Britain       5%
France              2%
Switzerland         2%
Japan               9%
Germany             6%
Canada              3%
Korea               2%
Rest of the world  16%
From left, former Twitter CEO Dick Costolo, Emmanuelle Charpentier, Jennifer Doudna and Cameron Diaz.


RESEARCH

Biology’s big hit

Scientists now have several tools to edit the genomes of
living organisms. One of the most recent is revolutionizing
research and has thrust two of its creators into the limelight.


BY ZOE CORBYN 


Emmanuelle Charpentier’s initial ambitions
for the genome-editing technique
CRISPR-Cas9 were modest. She had
originally thought that it might find a practi- 
cal application in making a virus-resistant 
yogurt bacterium to help manufacturers create 
long-lasting cultures. But, as she learned more 
about how the CRISPR system operates, her 
plans took a radically different turn. Instead of 
reporting a potential aid to the dairy industry, 
the 2012 paper she co-authored¹ introduced
CRISPR-Cas9 to the world as a technology that 
could precisely edit DNA. First her colleagues 
adopted the platform. Then it spread like wildfire.
“I realized that actually people had been a
bit desperate for an easy-to-use tool,” says the
microbiologist, now at the Max Planck Institute
for Infection Biology in Berlin. “Their hunger
was proof that the existing technologies were
not that easy to use.” The paper catapulted
Charpentier and co-author Jennifer Doudna at the
University of California, Berkeley (see page S6), 
into the realm of science stardom. 


A few tools to edit the genomes of living cells 
already existed when the paper by Charpen- 
tier and Doudna came out, most prominently 
zinc-finger nucleases (ZFNs) and transcription 
activator-like effector nucleases (TALENs). But 
because CRISPR-Cas9 is much easier to use 
than either of these options, it has made genome 
editing, which used to be a specialist process, 
routine. Many more laboratories have started 
to edit DNA, and numerous investigators who 
were previously using ZFNs and TALENs have 
switched to the new platform (see ‘Popularity 
of genome-editing kits’), says Dana Carroll, a 
biochemist at the University of Utah in Salt Lake 
City, who researches genome-editing tools. 

Nevertheless, the mechanistic details of 
how the three technologies work are remark- 
ably similar in the sense that they all consist of 
enzymes called programmable nucleases that 
can be directed to cleave DNA at any specific 
nucleotide sequence. In all cases, the cell then 
rushes to repair the double-stranded break, with 
one of two mending options: non-homologous 
end joining or homologous recombination (see 
page S2). The former occurs if restoration is left 
entirely to the cell’s own machinery, and leads to
small, random nucleotide insertions or deletions 
that often disrupt a gene’s activity, effectively 
turning it off. The other, more difficult option
allows genes to be corrected or new genes to be
inserted as the cell copies a DNA repair template
that is delivered alongside the cutting machinery. 
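
To see why small insertions or deletions so often switch a gene off, it helps to look at the reading frame. The sketch below is illustrative only: the toy sequence and the function names are assumptions made for this example, not drawn from any real genome or software library. It deletes a single base from a short coding sequence and shows that every downstream three-base codon changes, which is what happens whenever the number of bases added or removed is not a multiple of three.

```python
def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete three-base codons."""
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def apply_deletion(seq: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at index `start` (0-based)."""
    return seq[:start] + seq[start + length:]


def is_frameshift(indel_length: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


# Hypothetical toy gene: ATG GCT GAA ACC TGA
gene = "ATGGCTGAAACCTGA"

# Simulate an error-prone repair that loses one base inside the second codon.
edited = apply_deletion(gene, start=4, length=1)

print(codons(gene))
print(codons(edited))
print(is_frameshift(1), is_frameshift(3))
```

A deletion of three bases would instead remove one codon and leave the rest of the frame intact, so not every repair of this kind disables the gene.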

As well as improving research tools, genome- 
editing technologies have advantages over 
conventional methods for altering gene expres- 
sion as therapeutics. For example, classic gene 
therapy uses a vector (such as a virus) to ran- 
domly insert a healthy version of a defective gene 
somewhere in the genome, in the hope that the 
new gene will correctly perform its function 
wherever it lands. By contrast, genome editing 
fixes a faulty gene in its original location. Because 
there is a very limited chance of altering genomic 
geography, there is little need to worry that the 
edit will disrupt other genes. “It is like fixing a 
deflated tyre rather than attaching a fifth tyre to 
your car,” sums up Fyodor Urnov, senior scientist 
at genome-editing biotechnology firm Sangamo 
Biosciences, based in Richmond, California. 

In many circumstances, genome editing also 
offers an improvement over another therapeutic 
tool, RNA interference (RNAi), which alters the 
products of DNA transcription by selectively 
destroying messenger RNA molecules. This is 
because genome editing permanently fixes the 
output of a defective gene for the lifetime of the 
edited cell and its progeny. With RNAi, changes 
occur only while a messenger-RNA-destroying 
agent is present in the cell. 


EARLY OPTIONS 

Genome-editing technologies came to the fore 
with the work of Srinivasan Chandrasegaran, 
a chemist at Johns Hopkins University in 
Baltimore, Maryland. In the late 1990s, 
Chandrasegaran was trying to manipulate 
bacterial enzymes that cut DNA². He realized
that the best approach would require an
enzyme with both DNA-recognition and cut- 
ting domains that did not overlap so he could 
strip away the recognition part and attach the 
cutting section to something that could be engi- 
neered to locate any nucleotide sequence. Enter 
an enzyme from the bacterium Flavobacterium 
okeanokoites: FokI. Chandrasegaran fused the 
enzyme’s cutting domain to proteins called 
zinc fingers. These proteins can be customized 
to recognize certain three-base-pair codes by 
changing just a few of the zinc fingers’ amino 
acids. By joining zinc fingers together, longer 
DNA sequences can be targeted. 
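
The arithmetic behind zinc-finger targeting is simple enough to sketch. The example below assumes only what is described above, that each finger reads three base pairs, and uses a made-up target sequence and hypothetical function names. It splits a target site into the triplets that individual fingers would need to recognize and counts how many fingers must be joined.

```python
BASES_PER_FINGER = 3  # each zinc finger recognizes a three-base-pair code


def split_into_triplets(target: str) -> list[str]:
    """Break a target site into the triplets read by consecutive fingers."""
    if len(target) % BASES_PER_FINGER != 0:
        raise ValueError("target length must be a multiple of 3")
    return [target[i:i + BASES_PER_FINGER]
            for i in range(0, len(target), BASES_PER_FINGER)]


def fingers_needed(target: str) -> int:
    """Number of zinc fingers required to cover the whole target site."""
    return len(split_into_triplets(target))


# Hypothetical nine-base target site, chosen only for illustration.
site = "GCGTGGGCG"
print(split_into_triplets(site))
print(fingers_needed(site))
```

On this counting, three linked fingers cover 9 bases and six cover 18. The sketch treats each finger as independent, which is an idealization: linked fingers can influence their neighbours in practice.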

Carroll was one of the first to recognize the 
wider significance of Chandrasegaran’s discov- 
ery. Together, they showed that ZFNs could edit 
DNA in living cells (frog oocytes)³. Carroll went
on to demonstrate the same thing in a whole
organism (the fruit fly)⁴. And in 2005, Urnov was
part of the team that first used ZFNs to edit DNA
in human cells⁵.


As exciting as ZFNs were, they proved tricky
to work with. “It was hard to develop zinc fin- 
gers for new targets in a really reliable way,” 
says Carroll. One inconvenience is that FokI
only slices through one of the two strands of a 
DNA double helix. To cut through both strands 
requires creating zinc fingers that are specific 
to the target string of nucleotides as well as to 
its complementary sequence. Worse still, when 
zinc fingers are linked in a row, they sometimes 
influence the operation of their neighbour. 

TALENs emerged⁶ in 2010, out of work by
two groups of plant pathologists — one led by 
Adam Bogdanove, now at Cornell University in 
Ithaca, New York, and the other by Ulla Bonas, 
now at the Martin Luther University in Halle, 
Germany. These groups were independently 
trying to ascertain how proteins called TAL 
effectors recognize DNA. 

TAL effectors are secreted by pathogenic 
plant bacteria of the genus Xanthomonas, in 
which their job is to activate plant genes that 
promote bacterial infection. The two groups 
found that a special section within a TAL effec- 
tor’s structure directs the protein to a particular 
sequence of DNA, each nucleotide of which is 
specified by a pair of amino acids⁷,⁸. Changing
the order of these amino-acid pairs directs the 
TAL effector to different parts of the genome. In 
other words, these proteins are an alternative to 
zinc fingers, and, in the same way, can be fused 
to a FokI cutting enzyme, forming TALENs.


CRISPR COMES TO TOWN 

Unlike ZFNs and TALENs, CRISPR-Cas9 has 
nothing to do with FokI. Instead, it cuts DNA 
with the enzyme Cas9, which snips through 
both strands of a DNA double helix at once. This
platform also differs from its predecessors by 
using RNA instead of proteins to guide its cut- 
ting enzyme to a specific DNA sequence (which 
is identified by complementary base-pairing 
between RNA and DNA). 

The CRISPR-Cas system is an adaptive 
immune system that is widely found in bacte- 
ria — the reason why Charpentier imagined 
it might make yogurt bacteria more resilient. 
CRISPR, or clustered regularly interspaced 
short palindromic repeats, refers to the small 
segments of genetic code that bacteria some- 
times capture from invading viruses and store 
in their own genomes for future reference. The 
term CRISPR was first coined in 2002, although 
CRISPR systems were observed (without an 
understanding of their function) in 1987. 

Working with Streptococcus pyogenes, a component of human skin flora with pathogenic strains, Charpentier’s group ironed out the details of the simplest CRISPR, and of how Cas9 interacts with this reference library. The team showed that when faced with a threatening virus, Cas9 consults the CRISPR array and derives two RNA molecules. One of these, trans-activating CRISPR RNA (tracrRNA), changes the shape of Cas9 ready for cutting DNA, and the other, CRISPR RNA (crRNA), defines the cutting site. The group’s collaboration with Doudna’s lab demonstrated that both RNA molecules are needed to lead Cas9 to a particular cutting site on an invading virus’s genome. They also proved that the system still works when the two RNAs are fused into one. And they altered the nucleotide code of this single guide RNA, redirecting Cas9 to cut elsewhere.

[Figure: Popularity of genome-editing kits. The ease of use of CRISPR-Cas9 has seen a rise in the number of orders for genome-editing kits from Addgene, a supplier of the kits based in Cambridge, Massachusetts (kits are shown because the construction of different tools requires different numbers of plasmids).]

Today, when a researcher wants to edit a genome using CRISPR-Cas9, he or she can design a guide RNA and have it made to order and delivered in the mail. This makes CRISPR-Cas9 less complicated, cheaper and faster to use than the other genome-editing tools.
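
To make the design step concrete, here is a minimal sketch in Python of how candidate guide sequences might be picked out of a stretch of DNA. It rests on two conventions that are not stated in this article and are assumptions here: a guide of 20 nucleotides, and the requirement that the target be immediately followed by an 'NGG' motif, as is usual for Cas9 from S. pyogenes. The function name and the example sequence are hypothetical, only the given strand is scanned, and real design tools also score efficiency and off-target risk.

```python
import re


def candidate_guides(dna: str, spacer_len: int = 20) -> list[tuple[int, str]]:
    """Return (position, guide RNA spacer) for each site on the given strand.

    Assumption: a usable site is `spacer_len` bases followed by any base
    and then 'GG' (the 'NGG' motif used by S. pyogenes Cas9).
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})[ACGT]GG)")
    guides = []
    for match in pattern.finditer(dna):
        protospacer = match.group(1)
        # The guide RNA carries the same sequence as this DNA stretch,
        # written in RNA letters (U instead of T); it base-pairs with
        # the complementary DNA strand.
        guides.append((match.start(), protospacer.replace("T", "U")))
    return guides


if __name__ == "__main__":
    example = "TTTTT" + "GATTACAGATTACAGATTAC" + "TGG" + "AAA"  # made-up sequence
    for position, guide in candidate_guides(example):
        print(position, guide)
```

Redirecting the enzyme then amounts to choosing a different 20-letter string, which is the practical reason the approach is quick to retarget.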

These advantages aside, the three main technologies each have different strengths. CRISPR-Cas9 is the only one that allows for many DNA sites to be edited simultaneously, using different guide RNA sequences on the same Cas9. TALENs have the longest DNA recognition domain, and therefore tend to have the fewest off-target effects — which occur when parts of a genome with an identical or near-identical nucleotide sequence to the target site are cut unintentionally. And ZFNs are small (one-third of the size of TALENs and much smaller than Cas9 from S. pyogenes, the most widely used version of Cas9) so they are the only genome-editing tool that can fit comfortably inside the adeno-associated virus, the most promising vector for delivering genome-editing-based therapies.
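
The notion of a 'near-identical' site can be illustrated with a short Python sketch that slides the target sequence along a genome and counts mismatched letters at each position. The function names, the toy sequences and the mismatch threshold are illustrative assumptions rather than anything described in this article; real off-target prediction tools also examine both strands and weigh where in the site the mismatches fall.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(genome: str, target: str, max_mismatches: int = 2) -> list[tuple[int, int]]:
    """Return (position, mismatches) for every window close to the target."""
    genome, target = genome.upper(), target.upper()
    width = len(target)
    hits = []
    for start in range(len(genome) - width + 1):
        distance = hamming(genome[start:start + width], target)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


if __name__ == "__main__":
    # Made-up genome: one exact copy of the target and one copy with a single change.
    toy_genome = "TT" + "ACGTACGT" + "GG" + "ACGAACGT" + "CC"
    print(near_matches(toy_genome, "ACGTACGT", max_mismatches=1))
```

A longer recognition sequence leaves fewer such near-matches in a genome, which is the intuition behind the point about TALENs above.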

But there is another issue that influences the technologies’ adoption. Although it is fairly clear who owns the intellectual property for ZFNs and TALENs, the situation with CRISPR-Cas9 is much less certain. This is particularly important for commercial development. “If you are a company it may come down to intellectual property,” says Keith Joung, a pathologist at Harvard Medical School in Boston, Massachusetts, who both develops genome-editing technologies and applies them in medical research. The key patents covering the use of CRISPR-Cas9 as a genome-editing tool for mammalian cells were awarded to the Broad Institute of Massachusetts Institute of Technology and Harvard beginning in April 2014, based on research by bioengineer Feng Zhang. Zhang’s group at the Broad Institute — as well as that of George Church, a geneticist at Harvard (see page S7) — were the first to show in 2013 that CRISPR-Cas9 works in human cells. The Broad Institute fast-tracked its patent applications, leapfrogging those filed earlier in 2012 by the University of California. The latter are supported by Charpentier and Doudna’s research and are still being examined. In response to the patent awarded to the Broad Institute, the University of California has initiated ‘interference proceedings’ — a patent priority contest — saying that the use of CRISPR-Cas9 in human cells was reasonably self-evident from Charpentier and Doudna’s 2012 paper.

It is a complex situation, but no one has time to sit back and wait for a ruling. Both the Broad Institute and the University of California are issuing licences to companies around the world. Meanwhile, the innovations keep rolling in. In September 2015, Zhang’s group reported a new CRISPR system that avoids using Cas9 altogether. The DNA snipping is instead achieved with another enzyme, Cpf1 (ref. 12). Whether that difference is enough to warrant a new intellectual property ruling is unknown. But, given the difficulty of fitting Cas9 into an adeno-associated virus vector, the potential advantages of Cpf1, which needs a much smaller RNA guide than Cas9, are clear. And yet, even though Cpf1 was only the second type of CRISPR-cutting enzyme to be characterized, there are already signs that there are many more to come. In late October 2015, Zhang and his collaborators published details of three new CRISPR enzymes. Their initial analysis suggests that the new enzymes have distinct properties from Cas9 and Cpf1, and could, therefore, further widen the genome-editing toolbox. “All I can say,” says Charpentier, smiling as she considers the field’s future, “is the principle of RNA programmable enzymes is a very nice one.”


Zoé Corbyn is a freelance journalist based in San Francisco.

PERSPECTIVE

Embryo editing needs scrutiny

Genome editing presents many opportunities. But the advent of human-germline editing brings urgency to ethical discussions, says Jennifer Doudna.

Modern molecular biology arose in the 1970s when researchers realized that they could use bacterial enzymes, which evolved to defend bacteria against pathogens, to modify DNA in other organisms. That breakthrough initiated an active discussion about the safety and ethics of these ‘recombinant DNA technologies’, and highlighted the importance of transparency and open discourse in fostering public trust in the scientific community. Some 40 years later, we have the latest evolution of this technology: CRISPR-Cas9. The system makes genome engineering even easier, and in doing so opens it up to many more stakeholders. This once again raises fundamental questions about appropriate use of a powerful technology, made more urgent by a recent demonstration of human-germline editing. At least one thing is clear at this stage — we do not yet know enough about the capabilities and limits of the new technologies, especially when it comes to creating heritable mutations.

In response to these fundamental ethical questions, the US National Academies of Sciences, Engineering, and Medicine, Britain’s Royal Society and the Chinese Academy of Sciences will co-sponsor an international summit in December to consider the scientific and societal implications of genome editing. The issues up for discussion span clinical, agricultural and environmental applications, but most attention will focus on human-germline editing, owing to the potential for this application to eradicate genetic diseases and, ultimately, to alter the course of evolution.

The rapid development and widespread adoption of easy-to-use, inexpensive and effective genome-editing methodologies has changed the landscape of biology. The simplicity of the CRISPR-Cas9 system allows researchers and students to make precise changes to genomes, thereby enabling many experiments that were previously difficult or impossible to conduct. For example, CRISPR-Cas9 can be used to precisely replicate the genetic basis for human diseases in model organisms, leading to unprecedented insights into previously enigmatic disorders. The Cas9 enzyme can also be used to precisely alter epigenetic signatures, providing a means to manipulate the products of transcription without changing the DNA code. Moreover, the technology makes it easier to correct genetic defects in whole animals and in cultured tissues produced from stem cells — strategies that could eventually be used to treat or cure human disease.

When genomic changes are made in fully developed non-reproductive cells, they affect only the treated organism or person and do not become heritable. But if genomic changes are made to germ cells such as those that develop into eggs or sperm, or to developing embryos, the changes are incorporated into the cells of the organism that grows from them — including its own germ cells. Hence the changes can be passed on to future generations. We know that CRISPR-Cas9 technology works in both non-reproductive cells and germ cells, and in both primate and human embryos. The publication of human-embryo editing experiments in May (P. Liang et al. Protein Cell 6, 363-372; 2015) by researchers at Sun Yat-sen University in Guangzhou, China, lends a sense of urgency to December’s meeting. Although those experiments were carried out on embryos that could not develop into a baby, the study nonetheless underscored the fact that this is a technology that could have profound implications for permanent alteration of the human genome.

Opinion on the use of human-germline engineering varies widely. 
Some scientists favour the rapid development of the technology, 
whereas others advise banning it for the foreseeable future. In my 
view, a complete ban might prevent research that could lead to future 
therapies, and it is also impractical given the widespread accessibil- 
ity and ease of use of CRISPR-Cas9. Instead, solid agreement on an 
appropriate middle ground is desirable. In addition, future discussions 
that build on this December's meeting should address other potentially 
harmful applications of genome editing in non-human systems, such 
as the alteration of insect DNA to ‘drive’ certain 
genes into a population. 

As the public conversations proceed, five 
specific steps, which should be taken to ensure a 
prudent path forward, have emerged. 

First, safety: the global community of scientists 
and clinicians needs to adopt standard meth- 
ods for measuring genome-editing efficiency 
and off-target effects, so that researchers find 
it easier to compare and evaluate the results of 
different experiments for clinical relevance. 
Second, communication: the December sum- 
mit should stimulate further forums in which 
experts from the genome-editing and bioethics 
communities provide information and educa- 
tion for the public about the scientific, ethical, 
social and legal implications of human-genome 
modification. Third, guidelines: there should be 
international cooperation by policymakers and scientists to determine 
a shared path forward and to provide clear guidance about what is 
and is not ethically acceptable research. Fourth, regulation: out of this 
cooperation, appropriate oversight should be organized and applied 
to laboratory work that aims to evaluate the efficacy and specificity of 
genome-editing technologies in the human germ line. And fifth, cau- 
tion: human-germline editing for the purposes of creating genome- 
modified humans should not proceed at this time, partly because of 
the unknown social consequences, but also because the technology 
and our knowledge of the human genome are simply not ready to do 
so safely. 

The December summit is an important opportunity for China, the United Kingdom and the United States to lead the global discussion, and for the genome-editing community to renew its commitment — which began more than 40 years ago — to wholeheartedly engage with the public.


Jennifer Doudna is a molecular and cell biologist at the University of California, Berkeley.
e-mail: doudna@berkeley.edu

Encourage the innovators

Rather than emphasize risks that are not entirely new, talks about germline editing should focus more on the benefits, argues George Church.

human-germline editing in Washington DC on 1-3 December.

Now is, therefore, a good time to encourage the general public to become well informed on key issues, which may get muddled by out-of-date facts or loose phrasing. This technology is poised to transform preventive medicine. Rather than talk about the possibility of banning alteration of the human germ line, we should instead be discussing how to stimulate ways to improve its safety and efficacy. I hope to rectify some common misconceptions.

The potential to alter the human germ line did not arise with the discovery of CRISPR-Cas9, nor with other genome-editing technologies such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). Gene therapy was first developed in the 1970s. And even though the term CRISPR-Cas9 has been used interchangeably with gene therapy, none of the current 2,200 gene-therapy clinical trials involve this technology — but they do modify the genomes of adults and children. There is no technical reason why gene therapy could not be deployed to alter the human germ line — yet almost 80% of countries, including the United States and China, have not banned such modification. In fact, germline editing can be a by-product of the systemic application of gene therapy to non-reproductive cells. A similarly little-recognized point is that the DNA of embryonic cells can be edited without affecting the germ line.

Human-germline editing is not special with respect to permanence or consent. Replacing deleterious versions of genes with common ones is unlikely to lead to unforeseen effects and is probably reversible. Even if the editing was difficult to reverse, this would not be especially unsafe compared with other commonly inherited risks. Offspring do not consent to their parents’ intentional exposure to mutagenic sources that alter the germ line, including chemotherapy, high altitude and alcohol — nor to decisions that reduce the prospects for future generations, such as misdirected economic investment and environmental mismanagement.

We already know that germline editing is unlikely to cause dangerous, unforeseen mutations. In the best case scenario so far, CRISPR-Cas9 seems capable of less than 1 error per 300 trillion base pairs, and techniques to reduce these off-target effects using ‘CRISPR pairs’ might cut this by many factors of ten. That said, the issue is not simply about the number of off-target effects that might occur anywhere in the genome, but whether they appear in certain genes that, if altered, increase the risk of cancer in a particular tissue type. Given that there are about 1,200 of these tumour suppressor genes in the human genome, with a target size of about 3,000 base pairs each, the risk of an unintentional edit in one of them is a million times lower than for the genome as a whole. Using one altered germ cell rather than a billion somatic cells is very likely to be a billion times less risky because each of the billion cells has an independent chance to add to the risk of initiating cancer.

Meanwhile, human-germline editing is needed because alterna- 
tive methods for preventing the transmission of inherited diseases 
are problematic. Prenatal genetic diagnosis during in vitro fertili- 
zation (IVF) is often put forward as an alternative to editing. But 
this does not offer a solution for someone who has two copies of 
a deleterious, dominant version of a gene nor for potential par- 
ents who both have two copies of a harmful, recessive version of a 
gene. This is a bigger problem than the population frequencies of such genes suggest — marriage between blood relations is a deeply rooted social trend among one-fifth of the world’s population.

Those who want to ban human-germline editing should also 
consider that such a move would do little to allay concerns about 
ethically dubious attempts to ‘enhance’ humans. To think that there 
is not already a cadre of IVF clinicians poised 
to engage in such practices, perhaps even 
supported by governments, is to ignore, for 
example, the history of doping in sport. These 
kinds of ambitious individuals and institutions 
are unlikely to be dissuaded by an agreement 
made on their behalf by others with a differ- 
ent view. 

Finally, the concept of a ban on germline 
editing does not make sense. There is already 
a ban on using medical technologies in 
humans until they are proven safe and effec- 
tive in appropriate animal trials. Then, follow- 
ing human trials, they can only be applied to 
the general population for those conditions 
for which their use has been demonstrated. 
Banning human-germline editing could put 
a damper on the best medical research and 
instead drive the practice underground to black markets and 
uncontrolled medical tourism, which are fraught with much 
greater risk and misapplication. Instead, the generally high safety 
and efficacy standards of regulatory agencies should be encour- 
aged rather than saddled with pessimistic assumptions about the 
trajectory of promising approaches. 

The genome-editing community can effectively encourage 
researchers to pursue innovative technologies and to improve the 
safety and efficacy of the new tools. And, as discussion of germline 
editing becomes more mainstream, we should learn how to better address the concerns of those who are unfamiliar with the techniques so that the benefits, as well as the risks, are clear to them.


George Church is a geneticist at Harvard Medical School in Boston, Massachusetts.
e-mail: gmc@harvard.edu

[Image: HIV (artist illustration) could be kept at bay by editing the DNA of immune cells.]

Closing the door on HIV

Although yet to complete clinical trials, genome editing has already shown promise against a globally important disease.

BY MICHAEL EISENSTEIN


Sceptical is an understatement for Jim Riley’s first thoughts when, ten years ago, he learned that scientists at Sangamo BioSciences wanted to use genome-editing technologies to treat patients with HIV. “I thought they were insane,” recalls Riley, a microbiologist at the University of Pennsylvania in Philadelphia. “I thought there was no way you could do this at a high-enough efficiency to have a really meaningful effect.”

What the Sangamo researchers were planning was remarkable indeed. Their goal was not merely to control the symptoms of HIV/AIDS, but to directly modify the genes of adults who were HIV positive to eliminate their susceptibility to the virus. One of HIV’s primary means of entering immune cells, including helper T cells and macrophages, entails latching onto a cell-surface protein called C-C chemokine receptor type 5 (CCR5). A small percentage of people — roughly 10% of those of European descent — carry a deletion that removes 32 nucleotides from the gene that encodes CCR5. The resulting receptor is truncated and impossible for the virus to grasp. This means that homozygous individuals — those who inherited the mutation from both their mother and their father — are essentially resistant to the most commonly transmitted strain of HIV.
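
How rare the fully protected, homozygous group might be can be estimated with a few lines of Python. This is a rough sketch under two assumptions that the article does not make: that 'roughly 10%' counts everyone with at least one copy of the deletion, and that the population is in Hardy-Weinberg equilibrium (random mating, no selection). The function name is hypothetical.

```python
import math


def homozygote_fraction(carrier_fraction: float) -> float:
    """Estimate the share of people with two copies of a variant.

    Assumes Hardy-Weinberg equilibrium: if q is the variant's frequency,
    people with no copy make up (1 - q)**2 and people with two copies q**2.
    """
    # carrier_fraction = 1 - (1 - q)**2, solved for q.
    q = 1.0 - math.sqrt(1.0 - carrier_fraction)
    return q * q


if __name__ == "__main__":
    share = homozygote_fraction(0.10)
    print(f"Estimated homozygous share: {share:.4%}")
```

Under these assumptions only a fraction of a per cent of the population would carry two copies, which helps to explain why a matching homozygous donor is hard to find.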

To replicate this desirable trait, scientists at Sangamo, a biopharmaceutical company based in Richmond, California, have been working closely with academic researchers across the United States, including — once he overcame his initial surprise — Riley and his team at the University of Pennsylvania. The project uses one of the more established tools of genome engineering, zinc-finger nuclease (ZFN) technology. Sangamo’s product, SB-728, contains a set of engineered protein parts called zinc fingers that bind to specific sites within the CCR5 gene. These zinc fingers are linked to a nuclease enzyme that can cut the DNA. In 2008, Riley’s team showed that SB-728 is capable of efficiently and specifically snipping out a chunk of the CCR5 gene in cultured human T cells (E. E. Perez et al. Nature Biotechnol. 26, 808-816; 2008).

These findings offered tantalizing proof of 
concept that such editing might provide real 
protection for patients. 


BERLIN AND BEYOND 

There is a medical precedent for thinking that this approach will work against HIV. Back in the 1990s, US student Timothy Ray Brown became infected with the virus while studying in Berlin, Germany. About a decade later, he developed acute myeloid leukaemia. Things got even worse when his first two courses of chemotherapy, given to treat the leukaemia, caused his kidneys to fail. So doctors discontinued his antiretroviral drugs, which meant that his viral load started to climb. Yet, remarkably, it was this combination of leukaemia and HIV that proved to be Brown’s salvation.

In 2007, he received a stem-cell transplant at Charité, a large teaching hospital in Berlin. The blood stem cells that Brown received were carefully chosen for him. Normally, doctors verify only that the tissues of the donor and the recipient match — for blood stem cells they check a marker called human leukocyte antigen — but in Brown’s case, the medical team also screened potential donors for those homozygous for the CCR5 mutation. After radiation therapy, the blood stem cells that Brown received, and from which his T cells developed, were therefore immune to HIV. After a few rounds of treatment, Brown was soon in remission. His T-cell levels rose, and he has remained disease free without the need for antiretroviral drugs.

NATURE.COM: For more on genome editing for HIV visit go.nature.com/kjakiv

Brown’s recovery was inspirational for researchers contemplating CCR5 as a target for genome editing. “There aren’t many genes that I’m aware of where knocking them out doesn’t do any harm, but instead has a therapeutic benefit,” says Paula Cannon, a specialist in gene therapy and infectious disease at the University of Southern California in Los Angeles, who began her research of ZFNs as a tool for modifying CCR5 in 2007.


INTO THE CLINIC 

The early success of SB-728 in rep- 
licating the CCR5 mutation, cou- 
pled with the story of the Berlin 
patient, as Brown became known, 
made researchers optimistic for 
clinical trials of the therapy. Between 2011 and 
2013, researchers at the University of Pennsyl- 
vania, including immunotherapist Carl June 
and HIV specialist Pablo Tebas, used SB-728 to 
modify the genomes of helper T cells (the main 
target of HIV) obtained from 12 volunteers 
who were HIV positive. The researchers then 
cultivated the cells and transplanted them back 
into the donors. All the patients experienced 
a boost in their T-cell count, and each patient 
established a small, but stable subpopulation of 
immune cells with edited CCR5 genes. When 
treatment with antiretroviral drugs was inter- 
rupted to test whether the gene edits worked on 
their own, some patients saw transient reduc- 
tions in their viral load (P. Tebas et al. N. Engl. 
J. Med. 370, 901-910; 2014). “The take-home 
for me was that the engineered cells got into 
patients and lasted longer than were expected,” 
says Cannon, who was not directly involved in 
the study. 

The next challenge was how to make this 
immune protection more potent and durable. 
One approach is to generate a larger popula- 
tion of ZFN-modified T cells. So, in a separate 
study, three patients were given a mild dose 
of chemotherapy to reduce their immune- 
cell populations before transplantation. As 
an added boost to the therapy, in addition to 
editing helper T cells, the researchers also used 
SB-728 to modify killer T cells, which can also 
be destroyed by HIV infection. 

“By creating a little more space and relying on the homeostatic factors that maintain T-cell levels, the cells have a better chance of survival and of giving rise to long-lasting cell populations,” explains Riley. Two of the three patients experienced a profound drop in viral load, and they have not had to take antiretroviral treatment for more than a year.

How many cells must have their CCR5 genes edited to keep HIV at bay is not clear, however. About 5% of circulating T cells were successfully edited in the most recent trials, but Cannon points out that there are also populations of T cells hiding in tissues, which makes the total pool a lot bigger than estimates from circulating cells would suggest. A fully modified T-cell population would be a tall order, but it may be possible to achieve protection even with a relatively small proportion of edited cells, according to Hans-Peter Kiem, a gene-therapy researcher at the Fred Hutchinson Cancer Research Center in Seattle, Washington. Kiem’s group uses primate models to study the clinical potential of genome-edited immune cells. “If we only protect about 20% of the cells, we get a very robust boost in the immune response against HIV,” says Kiem, referring to a 2013 study in which he tested the extent to which genetically modified stem cells protect pig-tailed macaques from simian HIV.

[Photo: Timothy Ray Brown is disease-free after receiving HIV-immune blood cells.]


LOOKING AHEAD 

Kiem thinks that the critical factor for building immunity against HIV is engraftment — the extent to which transplanted cells incorporate themselves into the tissues of the recipient’s body. He and Cannon are separately exploring whether SB-728 might perform better if it is applied to haematopoietic stem cells — the common precursor of all of the various blood and immune cell subtypes — rather than to a few varieties of fully developed immune cells. “Then we can hit the T cells as well as monocytes, macrophages and other cell types that can be infected by, or serve as reservoirs for, HIV,” explains Kiem.

However, stem cells are more difficult to cultivate and edit than T cells, and must be carefully maintained to ensure that they retain their developmental flexibility. Using stem cells also means more serious side effects for patients, who will have to undergo an aggressive course of chemotherapeutic ‘conditioning’ before treatment. “It kills some of the stem cells in the bone marrow to make room for the engineered cells — and it’s not a trivial thing to undergo,” cautions Cannon. This strategy is also slower to have an effect: it takes between six months and a year before the stem cells fully replenish the mature T-cell population. Cannon is involved with a newly launched clinical trial at the City of Hope Hospital in Duarte, California, which aims to explore how well these cells engraft into the bone marrow of 12 patients with HIV, and how many HIV-proof immune cells they each produce.

The therapeutic landscape for HIV has 
changed significantly in the ten years since 
Sangamo began pursuing this project. For a 
start, many patients can now keep their viral 
loads in check indefinitely by taking stand- 
ard antiretrovirals. Nonetheless, a significant 
minority do not respond to these drugs. 

Dale Ando, Sangamo’s chief medical officer, is intrigued by the potential for a ‘one-hit’ treatment as opposed to having to take lifelong medication. “With antiretroviral therapy, there is a significant toll on the brain and heart, and increased risk of cancer, as well as chronic inflammation from long-term HIV infection,” he says. By comparison, and leaving aside the effects of the associated chemotherapies, SB-728 has not been linked to any serious side effects. So far, all the data on SB-728 have assuaged the most immediate concerns about ZFNs — that off-target edits elsewhere in the genome may have damaging or carcinogenic consequences.

Perhaps more importantly, these HIV studies have helped to clear a regulatory path for future genome-editing therapeutic programmes. “We’ve had multiple discussions on T cells, stem cells and in vivo genome editing, so the US Food and Drug Administration (FDA) is quite comfortable,” says Ando. As mainstream attention shifts to another genome-editing technology, CRISPR-Cas9, many believe that the FDA will find itself on familiar turf when drug applications that use the newer tool are filed.

From Cannon’s perspective, much of the credit for this rapid progress belongs to the HIV patient community, whose political activism and hunger for a cure has helped to push genome editing into the clinic. “They’ve gotten us to this stage with this new therapy very quickly,” she says, “and hopefully it will have benefits for all sorts of other diseases in the future.”


Michael Eisenstein is a freelance writer based 
in Philadelphia, Pennsylvania. 


3 DECEMBER 2015 | VOL 528 | NATURE | S9 


© 2015 Macmillan Publishers Limited. All rights reserved 


Expanding possibilities

The first therapeutics based on genome-editing tools will treat diseases caused by single genes, but many other factors dictate what is currently possible.


BY VIRGINIA GEWIN 


In late 2013, Hans Clevers isolated intestinal stem cells from two children with cystic fibrosis, a disease that results in thick, sticky mucus and affects the lungs and other organs, including the intestines. He used these stem cells to grow gut tissue that he calls ‘miniguts’, and introduced a healthy version of the gene that is disrupted in people with cystic fibrosis. This was one of the first attempts to show that CRISPR-Cas9, a gene-editing tool that has since received a huge amount of attention, can repair human tissue. The results were impressive: the faulty gene was corrected in about half of the miniguts that Clevers tested.

Clevers, a molecular geneticist at the Hubrecht Institute in Utrecht, the Netherlands, is still amazed by the success. “It is remarkable how well CRISPR works,” he says. “I’ve never seen anything — apart from PCR — that was so simple and so powerful.” PCR, or polymerase chain reaction, is effectively a way of photocopying DNA and has become an essential tool for geneticists.

The work by Clevers helped to make the 
case that CRISPR-Cas9 is not just a tool of 
basic science, but a source of medical break- 
throughs to come. The CRISPR craze is now 
in full swing, and the platform's ability to treat 
a range of diseases — from severe combined 
immunodeficiency (SCID) to muscular dys- 
trophy — is being put to the test. Many of 
the scientists involved predict that its medi- 
cal applications will rapidly outstrip those of 
the other main genome-editing tools, such as 
transcription activator-like effector nucleases 
(TALENs) and zinc-finger nucleases (ZFNs), 
because CRISPR-Cas9 is more efficient and 
easier to use. 

The consensus is that monogenic diseases — 
those involving only one gene — are the low- 
hanging fruit of the field. But even the most 
ardent genome-editing enthusiasts say that 
this term is misleading. “The fruit is still pretty 
far up the tree,” says Chad Cowan, a stem-cell 
biologist at Harvard University and co-founder 
of CRISPR Therapeutics, a biotech company 
based in Cambridge, Massachusetts, set up to 
use CRISPR-Cas9 to cure diseases. 

There are many factors that determine 
whether genome editing is a viable approach 
for a particular disease. The main difficulty 
— the one that dictates which diseases are 
plausible targets for therapeutics — is deliv- 
ering the therapy, and this strongly depends 
on the ability to access the cells or organs that 
need correction. But many characteristics 
guide researchers in prioritizing their efforts. 
The percentage of cells whose genomes must 
be edited to achieve a medical benefit is one 
important factor, as is whether treating the 
affliction requires deleting, introducing or 
correcting genes. 


REMOVING THE PROBLEM 

The delivery hurdle is so substantial that 
researchers are trying to work around it, 
rather than overcome it. One strategy is to 
extract cells, edit their genomes, check that 
there are no unintentional genetic changes, 
known as ‘off-target effects’, and then reintroduce
them to the body so that they can
operate as healthy cells. This approach is 
particularly promising for problems of 
blood and bone marrow, including HIV (see 
page S8) and sickle-cell disease. 

Efforts are already under way to develop
CRISPR-Cas9 treatments to tackle sickle-cell
disease. One of the painful symptoms
of the disease is caused by misshapen
blood cells clogging the blood vessels, and
researchers hope that gene editing could
offer a treatment, if not a cure. The target is a
gene called BCL11A, which causes red blood
cells to produce adult, rather than fetal, haemoglobin.
Fetal haemoglobin does not form
long chains, so tricking cells into producing
it could result in less clogging by red blood
cells. A team of researchers including Feng
Zhang of the Broad Institute in Cambridge,
Massachusetts, recently showed that using
CRISPR-Cas9 to make cuts in the genomic
region that controls the expression of
BCL11A increases the production of fetal
haemoglobin.

Sickle-cell disease is a straightforward 
target, even among monogenic diseases 
without problematic delivery, and viral dis- 
eases of the blood can be tackled in a similar 
way. Cowan's group has successfully used 
CRISPR-Cas9 to disable the CCR5 gene in 
half of the blood stem cells treated. This
is important because HIV uses the CCR5 
receptor to enter cells (see page S8). Apply- 
ing this approach to bone-marrow cells 
could effectively immunize people against 
the virus, he says. 

“Inactivating a disease-causing gene is 
a whole lot easier than correcting a gene,” 
says Erik Sontheimer, a biologist at the Uni- 
versity of Massachusetts Medical School in 
Worcester and co-founder of Intellia Thera- 
peutics, based in Cambridge, Massachusetts, 
which also develops CRISPR-Cas9-based 
treatments. However, there are not many 
heritable diseases that can be fixed by sim- 
ply knocking a gene out, adds Bryan Cullen, 
a molecular geneticist at Duke University in 
Durham, North Carolina. 

That said, diseases with a strong genetic 
component in which there is one healthy 
and one mutant gene variant are good candidates
for this kind
says David Segal, a
genome researcher at 
the University of California, Davis. Segal thinks 
that Huntington's disease, a neurodegenerative 
disorder caused by a single mutation, is a prime 
example of a disease that could benefit from 
this approach. The only problem — and it is a 
significant one — is that the cells in need of cor- 
rection are not readily accessible because they 
are found in the brain. 

The eye, however, presents a much easier 
target. Tara Moore, a molecular biologist at 
Ulster University in Belfast, UK, is using this 
strategy to treat Meesmann’s epithelial cor- 
neal dystrophy, a heritable disease resulting 
in cysts on the cornea that can cause irritation
and blurred vision. She has successfully
used CRISPR-Cas9 to find and disable gene
variants that cause the disease, leaving the
healthy allele intact in cornea-generating
stem cells. She estimates that this approach
could work in roughly one-third of the
76 mutations that are known to cause corneal
disorders, of which Meesmann’s is just
one. These 76 mutations are spread among
just four genes. “The eye is so accessible,
and is such a small area to treat,” she says.
“And we're able to clearly monitor it to note
improvement.”

Blood cells are misshapen in sickle-cell disease.


INSERTING A SOLUTION 

An alternative to disrupting a gene is to intro- 
duce one, but that requires getting a DNA 
template to the site of genome editing (see 
page S2). This has been achieved for the liver, 
as a treatment for type I tyrosinaemia demon- 
strates. Those with this disease have a faulty 
gene called FAH that reduces their ability to 
break down the amino acid tyrosine, result- 
ing in liver damage. Last year, scientists at the 
Massachusetts Institute of Technology used 
CRISPR-Cas9 to insert a healthy version of 
the FAH gene into the liver cells of laboratory 
mice. The healthy gene was expressed in only 
1 of every 250 liver cells, but this was enough 
to reduce liver damage.

Perhaps not surprisingly, diseases that can 
be alleviated by editing only a small percent- 
age of cells are among the first to be targeted. 
Like type I tyrosinaemia, SCID falls into this 
category, and it has a delivery advantage: cells 
with a healthy or corrected gene sequence pro- 
liferate once they are put back into the body. 
By comparison, most cancers probably require 
all of the relevant genes to be edited to stop 
the disease from rebounding. Other diseases 
in which correcting a small percentage of cells 
might make a big difference include glycogen 
storage disease and ornithine transcarbam- 
ylase deficiency, an inherited disorder that 
causes ammonia to accumulate in the blood. 
But researchers might be hesitant — ornithine 
transcarbamylase deficiency was at the centre
of a gene-therapy trial in 1999 that was shut
down after a patient died. 

In comparison with CRISPR-Cas9, the 
older genome-editing tools, TALENs and 
ZFNs, can struggle with the task of correcting 
a gene, especially when it comes to achieving a 
normal level of gene expression from the edit. 
Sometimes overexpressing a gene can increase 
the risk of cancer. In this regard, says Cowan, 
“CRISPR-Cas has a defining and significant 
advantage over other gene-editing techniques.” 

But CRISPR-Cas9 is not always the optimal 
choice of the genome-editing tools available. 
When success comes down to size, for example, 
ZFNs often have a distinct advantage. Adeno-associated
viral vectors, which are promising
delivery systems, especially to the liver,
can accommodate a ZFN as well as an engineered
gene template, but sometimes struggle
to squeeze in the larger genome-editing
tools, TALENs and CRISPR-Cas9. Moreover,
CRISPR-Cas9 is known to have less inherent
specificity than TALENs with their long DNA
recognition domains — a particular concern
for the editing of large and complex genomes
— although researchers have made great
gains in reducing these ‘off-target’ effects.

Genome editors ultimately hope to tackle 
diseases of all levels of genetic complexity 
and affecting all parts of the body. But at the 
moment, even the therapies that are closest to 
approval are still a good way off. “Five years 
would be an aggressive timeline, but it could 
happen,” says Cowan.

This does not stop researchers from dream- 
ing. Segal, for example, suggests that the brain 
presents probably the most formidable chal- 
lenge for delivering genome-editing therapies. 
But he is well aware that there are many single- 
gene neurological disorders: Angelman syn- 
drome, Huntington's disease and Prader-Willi 
syndrome, to name just three. 

In the short term, the focus on diseases of
the eye, blood and liver, which are the easi- 
est organs to target with CRISPR-Cas9, will 
continue. Since his early success with cystic 
fibrosis, Clevers has started working on the 
liver, primarily because knowledge of how to 
transplant corrected stem cells — a key step in 
many putative genome-editing therapies — is 
more advanced for the liver than for the lung. 

Cullen has the same instinct. “My bet is that 
the first successes of CRISPR-Cas9 treatments 
will involve diseases in the liver,” he says. “The
liver is where everything goes, whether you
want it to or not.”


Virginia Gewin is a freelance science writer 
based in Portland, Oregon.


EPIGENETICS


The genome unwrapped 


Epigeneticists are harnessing genome-editing technologies to tackle a central question
hanging over the community — does their field matter?


BY HEIDI LEDFORD


On 18 February, a consortium of more
than 90 laboratories published a
landmark catalogue of the chemical
changes to DNA that are thought to influence
whether and how genes are expressed. Called
the Roadmap Epigenomics Project and sponsored
by the US National Institutes of Health,
the compendium offered an unprecedented
look at the layers of coding that exist on top
of the genetic code — collectively known as
the ‘epigenome’ — in 127 different human
tissues and cell types. The US$154-million
project was viewed as a crucial step towards
determining how this chemical code contributes
to human health and disease. As
researchers get to grips with the catalogue’s
contents, the project is also likely to provide
a leap forward in pinning down one of the
central mysteries of biology: how do cells
with the same genetic instructions take on
wildly different identities?

To read a special on the epigenome roadmap visit: go.nature.com/knjufe

It is still unclear what that epigenetic 
code actually does, and how it is generated. 
“I don't think it can be overstated how lit- 
tle we understand about how the epigenome 
works,” says Charles Gersbach, a biomedi- 
cal engineer at Duke University in Durham, 
North Carolina. “There are all of these epigenetic
marks and we don't know what they are
doing. Are they even necessary?” 

After years of wondering, biologists such 
as Gersbach are now in a position to find out. 
By harnessing genome-editing technologies, 
they are able to interrogate the epigenetic 
control of gene expression with remarkable 
power and specificity. Researchers can make 
or delete epigenetic marks at will, and home
in on RNAs and proteins that could play a
hitherto unrecognized part in directing gene 
expression. And with these new capabilities, 
they hope to build an answer to a key ques- 
tion that has plagued the field of epigenetics 
since its inception — do epigenetic marks 
alter gene expression or do changes in gene 
expression alter the marks? “It’s an absolutely 
legitimate question and we need to address 
it,” says Luca Magnani, a cancer researcher 
at Imperial College London. “The answer is 
either going to kill the field, or make it very 
important.” 


A THICKET OF COMPLEXITY 

Epigenetics is not for the faint of heart. 
Where the genetic code offers simplicity and 
stability, with its four bases of DNA, passed 
down stably from one generation to the next, 
the epigenetic code is gnarly and dynamic. 
Dozens of different chemical modifications
decorate both the DNA and the histones — 
proteins that package the DNA into chro- 
mosomes. All of these marks can vary from 
cell to cell, influenced by age, developmen- 
tal stage and the environment. “One of the 
biggest challenges in this area is knowing 
whether what you've observed is general- 
izable or if it’s specific to that gene or that 
cell type or the culture conditions or the day 
of the week or the cycle of the moon,” says 
Gersbach. 

Getting to grips with such complexity could 
generate huge payoffs, and not just for basic 
research. For example, when scientists alter 
epigenetic networks, they can coax stem cells 
to take on a new identity — and perhaps, in the
future, this could be used to treat disease. Simi- 
larly, genomic studies frequently point to the 
important role that the full collection of epige- 
netic patterns in a cell nucleus has in complex 
diseases such as diabetes or schizophrenia, 
notes Tim Reddy, a genomics researcher also 
at Duke University. “In a lot of these cases, it 
really seems to be not a DNA mutation that 
impacts the protein sequence, but a change in 
how genes are regulated,” he says. Targeting
these regulatory elements and altering their 
activity could one day yield a new approach 
for treating complex diseases, he adds. 

Drugs that are thought to work by modify- 
ing the epigenome are already on the mar- 
ket in places such as the United States and 
Europe. Some inhibit enzymes that either add 
or remove acetyl groups from histones, and 
can treat a range of conditions from epilepsy 
to cancer. Other drugs treat cancer
by blocking enzymes that remove methyl
groups from DNA.

But from a scientific standpoint, the problem
is that no one knows exactly which epigenetic
alterations lie behind these drugs’
effectiveness. The drugs act globally over the 
entire genome, rather than being directed to 
any specific location, which makes it impos- 
sible to use them to determine the function 
of individual, or even regional, epigenetic 
changes. Some researchers even view the 
tolerability of the side effects of these lib- 
erally acting drugs (including those that 
inhibit enzymes called histone deacetylases, 
or HDACs) as suggesting that some epige- 
netic marks are not important in regulating 
gene expression. “If you can just eat an HDAC
inhibitor, then exactly how important is that
enzyme?” says Gersbach. “It’s clear that we
don't understand it very well.”


A CRISP, NEW DAWN

The enzyme components of genome-editing 
technologies offer a way forward because they 
allow researchers to focus on a single region 
of DNA. Before the editing system CRISPR- 
Cas9 became widely used, researchers targeted 
the epigenome by altering the FokI enzyme 
— the enzyme involved in the editing technologies
zinc-finger nucleases (ZFNs) and
transcription activator-like effector nucleases
(TALENs). The first step was to disable FokI’s
capacity to cut DNA, without removing ZFNs’
and TALENs’ ability to home in on a target
sequence. The incapacitated enzyme was then 
attached to another enzyme that could make or 
remove epigenetic marks. The outcome was an 
epigenetic enzyme targeted to a specific loca- 
tion in the genome — or, put another way, a 
chance to interrogate the function of specific 
epigenetic changes. 

But ZFNs and TALENs can be difficult to 
work with, and results from experiments that 
use them have been slow to trickle in. ZFNs 
are also prone to creating unwanted, off-target 
alterations to the epigenome, notes Tomasz 
Jurkowski, a biochemist and epigeneticist at 
the University of Stuttgart in Germany. “You 
could not reach a final conclusion — maybe 
what you were seeing were secondary effects 
from somewhere else,” he says. 

Earlier this year, researchers reported that
CRISPR-Cas9 could be adapted to do the
same thing, but with less effort and uncertainty
(N. A. Kearns et al. Nature Meth. 12,
401–403; 2015). René Maehr, an immunologist
at the University of Massachusetts
Medical School in Worcester, and his colleagues
fused an enzyme called histone demethylase,
which removes methyl groups from
histones, to a deactivated Cas9 enzyme, and
then programmed it to target regions
of DNA believed to enhance the expression
of certain genes. The result was a
functional map of genetic ‘enhancer’
sequences that allows researchers to determine
what these enhancers do, how strongly,
and — most importantly — where they are
located in the genome.

Meanwhile, Gersbach, Reddy and their col- 
leagues coupled an inactive Cas9 to an enzyme 
called an acetyltransferase, which attaches acetyl 
groups to histones — a process that is thought 
to turn genes on (I. B. Hilton et al. Nature Bio- 
technol. 33, 510-517; 2015). Reddy says that he 
was surprised at the extent to which the expres- 
sion of a target gene increased when a histone 
in an enhancer region was acetylated, given the
uncertainty as to whether DNA marks are a
cause or a consequence of such activation. “That
result started to convince me that the acetylation
of histones may be a direct cause of gene activation,”
he says.

But it will take many such studies before 
the community knows whether that result 
applies to other epigenetic marks. It is pos- 
sible that some marks cause changes to gene 
expression, whereas others could merely 
be an effect of a change to gene expression. 
“It’s hard to put it all into some neat package,”
says Steven Henikoff, a geneticist at the
Fred Hutchinson Cancer Center in Seattle,
Washington. “For all we know, they might
have very minor effects on gene expression 
except in a few special cases.” 


THE WAY AHEAD 

Now, however, researchers have a tool to pick 
apart the detail. Because of its simplicity and 
versatility, CRISPR-Cas9 opens up an oppor- 
tunity to launch the kind of large-scale projects 
needed to reach that level of understanding. “If 
we want to target a region in the genome, we can 
have that targeting molecule here tomorrow for 
five dollars,” says Reddy. “We're going to get to 
march through every single one of these modifications
and figure out what they actually do.”

There will still be technical hurdles to over- 
come, cautions Gersbach. For example, the 
enzymes needed to make or erase epigenetic 
marks sometimes lose their activity when they 
are tacked on to inactive FokI. And, as epige- 
neticist Marianne Rots of the University of 
Groningen in the Netherlands notes, Cas9 is 
relatively large as proteins go. As a result, it can 
have trouble accessing stretches of DNA that are 
especially tightly wound. 

Despite this, there is still plenty of room for 
ambitious projects. Jeremy Day, a neuroscientist 
at the University of Alabama in Birmingham, 
is using CRISPR-Cas9 to study the long-lasting 
epigenetic changes associated with addiction 
that occur in the brain. His aim is to use recently 
described systems in which light activates 
CRISPR-Cas9. This would allow him to control 
where and when an enzyme adds or removes 
any given epigenetic mark. For Day, this advance 
means that the marks of addiction on the brain 
could one day be reversed, without hindering 
the ability of a patient to feel pleasure in response
to other stimuli. “You don't want to just deaden
people,” he says. “With these very specific tools
we can find out the critical modifications that 
perpetuate addiction.” And, more broadly, for 
the field of epigenetics, this light-activation technology
offers a kind of revolutionary power. “It
will allow us to learn a lot about the basic biology
of those epigenetic marks: how long do they last? 
How much of that modification do you need to 
affect the gene?” Day adds. 

Although any therapeutic application of 
CRISPR-Cas9 to epigenetics is still in the dis- 
tant future, the rapid pace of the field is already 
defying expectations. Jurkowski, for one, started 
his lab in 2012 just before the first papers show- 
ing CRISPR-Cas9 genome editing in human 
cells were published. Like many researchers, 
Jurkowski then took up CRISPR research in 
2014, but has been scooped twice by competing 
labs in less than two years. He takes the competi- 
tion in stride — it is the price of entry into the 
fast lane. Epigenetics is on the verge of a revolution,
he says. “This is just the beginning,” he
says. With just a little more time, “it will develop
into a completely new field.”


Heidi Ledford is a senior reporter for Nature 
in Cambridge, Massachusetts.


Tim Lu
Cocktail maker 


Tim Lu’s synthetic-biology research at the Massachusetts Institute of Technology in Cambridge
combines biological engineering with electronics and computer science to create bacteria that 
make structural proteins containing tiny semi-conductors called quantum dots. He explains how 
genome-editing techniques are furthering his research and their role in treating disease. 


How do you use genome editing in synthetic 
biology? 

Put simply, new editing technologies allow us 
to make genetic edits very efficiently. One of 
synthetic biology’s main focuses is to repro- 
gram DNA to achieve new functions inside 
living cells. Modifying DNA used to be quite a 
labour-intensive process, but that has changed. 
We spend a lot of our time iterating our designs 
and improving them, so the faster we can turn 
the crank, the faster we can converge on some- 
thing that actually works. The range of cells 
that we can modify has also greatly expanded 
with new genome-editing tools to encompass 
animal, plant and bacterial cells, increasing the 
scope of applications. 


Does your research have clinical applications? 

Yes. The goal is to endow cells with basic 
computing ability. By making cells that can 
sense their environments and take decisions 
based on the signals they detect, we hope to 
create new diagnostics and therapies. These 
days you go to the doctor, get a diagnosis, 
and then pop a pill with no control over it 
after you swallow. But what if something 
you swallow could sense disease indicators 


and respond with treatment before you 
became sick? 


That sounds amazing. But how exactly would 
these disease-sensing pills work? 

The idea is to edit organisms to turn them 
into sensors that record what goes on inside 
the complex environment of the gut. In other 
words, to create bacteria that can tell whether 
there are signals of disease such as inflamma- 
tion. If you ate these bacteria, they could then 
be recovered from your faeces to provide infor- 
mation about what happened as they transited 
through you. Bacteria could be engineered to 
not only sense their environment, but also to 
produce some sort of therapeutic molecule, so 
that they could deliver a drug only where it is 
needed. 


Do you foresee genetically edited bacteria 
becoming part of the human microbiota? 
Their first role is more likely to help us to better 
understand how this community of organisms 
contributes to health and disease. Microbiome
studies are primarily just surveys. Researchers
take faecal samples, sequence the bacteria
in the sample, and see what species are
there. From that they derive some interesting
hypotheses that link certain bacteria to par- 
ticular diseases. But missing from these stud- 
ies is a functional understanding of what the 
microbes are actually doing. What if a microbe
is only 0.5% of the gut microbiome, but has 
some really important function? 

In my lab, we have done a lot of work on 
targeted antimicrobials. For example, we engi- 
neer bacteria-invading viruses called bacterio- 
phages as ways of killing very specific bacteria 
or delivering genetic information into them. 
If you were to knock out one species at a time 
from a microbiome, and saw what effect that 
had on a host, you would get a much better 
understanding of what each member of the 
host’s bacterial community does. 


Do you think bacteriophages will be widely 
used as therapeutics in the future? 

There has been a lot of interest in alternative 
antimicrobials because antibiotic resistance is 
such a big problem. Phages have a part to play 
in the solution, but there are regulatory issues. 
In some Eastern European countries you can 
buy phage products over the counter, even 
though a lot of what is available has not been 
subjected to rigorous clinical trials. 

You often need a cocktail of different types 
of phages to properly target a bacterial spe- 
cies. If you were to approach that by taming 
wild phages, you would often find very dif- 
ferent families that have differently organized 
genomes in your cocktail. The key regulatory 
issue is that you need to make sure that each 
phage in a therapy is consistent within clear 
boundaries of biological variation because 
they all have quite different safety pro- 
files. We have been using synthetic biology 
techniques to create more uniform phage 
cocktails. These phages would work like 
antibodies — they have a common scaffold 
that can be reconfigured to target different 
bacteria. 


Do you have any concerns about these new 
genome-editing technologies? 

There is an emerging movement in which peo- 
ple are setting up shops in their garages. Com- 
munity labs are being set up that allow anyone 
to come in and be trained. Previously, you had 
to be an expert in making zinc-finger vectors 
to edit DNA, but now — because CRISPR- 
Cas systems are so easy to use — anyone with 
molecular biology training can do it. On the 
one hand it is an exciting time for the field 
because this movement is going to bring in a
lot of new ideas and talent. But on the other, 
it is also going to create new regulatory ques- 
tions. The democratization of biological engi- 
neering is inevitable. Now we have to size up 
the risks and benefits so we can harness what 
is going to come of it.


This interview has been edited for length and clarity.


Pigs reared at the University of Edinburgh’s Roslin Institute have had individual letters of their genetic code modified to protect them against African swine fever.


AGRICULTURE 


A new breed of edits 


Genome editing allows much smaller changes to be made to DNA compared with conventional 
genetic engineering. In terms of agriculture, this might win over public and regulator opinion. 


BY CLAIRE AINSWORTH 


In spring 2015, the first genome-edited
crop, a herbicide-resistant oilseed rape,
was planted in fields dotted across the
United States. Although the plant's DNA has
been directly altered by molecular biologists,
the company that created it, Cibus, based in
San Diego, California, explicitly markets the
crop as non-genetically modified (non-GM).
The company’s argument is that only a few
nucleotides of the plant’s existing genes have
been changed. No gene has been inserted from
a different kind of organism, nor even from
another plant.

A lot hangs on how governments around the
world decide to regulate agricultural products
that have had their genomes edited. The decisions
will influence the types of edited crops
and animal products that are developed. To US
regulators, Cibus’s oilseed rape is an example of
mutagenesis, not of genetic modification. This
is a relief to the company because preparing
for regulatory approval of a GM organism in
the United States can take more than five years
and cost tens of millions of dollars. Europe is
even stricter, and the European Commission
has yet to publish its legal interpretation of 
how genome-edited crops, such as the Cibus 
oilseed rape, should be regulated. Several 
political groups are lobbying for a hard line, 
which would frustrate many researchers. “If 
Europe regulates genome-edited organisms 
in the same way it does GM organisms, it will 
kill the technology here for all except the bio- 
tech companies working with profitable traits 
in the major crops,” says Huw Jones, senior 
research scientist at Rothamsted Research in 
Harpenden, UK, who is currently working on 
genome editing in wheat. 

Yet the potential applications of genome 
editing for global agriculture — and disease 
vectors (see ‘Hack the mosquito’) — are huge.
But so are the challenges that the world will 
face. According to projections by the United 
Nations, the world’s population is set to soar 
from the current 7.3 billion to 9.7 billion by 
2050. Agricultural output will have to increase 
to feed more mouths, even though the amount 
of fresh water available for irrigation is decreas- 
ing, and most of Earth’s arable land is already 
under cultivation. Add in the effects of climate
change — crop-damaging higher temperatures,
drought and flooding, not to mention
a rise in agricultural pests and diseases — and 
it is no surprise that food security is top of the 
international political agenda. 


DIFFERENT FURROWS 

Genetic modification and conventional breed- 
ing have long been available to assist in meet- 
ing these food-security challenges, but genome 
editing is different, argues Pamela Ronald, a 
plant pathologist at the University of Califor- 
nia, Davis. Genetic engineering is typically 
ham-fisted: it often involves inserting a large 
section of DNA from an entirely different kind 
of organism — often in another kingdom — 
with little control over where in the genome it 
lands. Meanwhile, conventional breeders are 
limited not only by the time it takes to cross in 
new traits, but also by the need to ensure that in 
doing so, they do not breed out the plant's other 
desirable characteristics. 

Compared with these alternatives, genome 
editing offers both subtlety and speed, wher- 
ever in the genome a researcher wants to target. 
“You can change even a single base pair, or you
can delete a gene very precisely,” says Ronald.
The speed comes from the technologies’ ability
to remake an existing gene in the image of
a more useful one, which might be present in
the breeding population at very low frequency.
Useful traits that are found only in wild populations
or related species — perhaps a species
that encounters similar pathogens — can be
quickly brought in. “Genome editing basically


Hack the mosquito

The mosquito has long held the title of the
world’s deadliest animal. The Anopheles
genus causes hundreds of thousands of
human deaths annually by transmitting
malaria parasites. Editing Anopheles
genomes — as well as those of Aedes
mosquitoes, which spread viral infections
such as yellow and dengue fevers — brings
with it the possibility of new research and
control methods.

Eric Marois of France’s National Centre
for Scientific Research in Strasbourg is
part of a team working with transcription
activator-like effector nucleases (TALENs)
to disrupt the gene TEP1, which is known to
help Anopheles gambiae to resist infection
by malaria parasites. Without the protection
conferred by this gene, Marois’s team found
that the mosquito from sub-Saharan Africa
became hypersusceptible to parasites.
That may not sound like an advance, but the
research is helping scientists to understand
the genetics that make this particular
species such a good vector, and may lead to
better malaria control, with or without gene
editing.

Research with Aedes, which is easier
to work with in the lab, is more advanced.
A few groups have applied zinc-finger
nucleases (ZFNs) and TALENs to the genus,
but Ben Matthews, a mosquito specialist at
Rockefeller University, New York, is trying
out CRISPR-Cas9 because it is the cheapest
and most user-friendly of the tools. Using
the relatively simple technique also means
his recent proof-of-concept paper is more
likely to be picked up by other infectious-disease
researchers. In the paper,
Matthews and his colleagues demonstrated
the use of CRISPR-Cas9 to delete parts of a
target gene, which created mutations that
were passed on in the Aedes germ line, and
to insert a whole gene at a specific location.

But that is all in the laboratory. Getting
insects with edited genomes to thrive in

the wild — so that the edited genes spread 
throughout a population — presents an 
entirely different challenge. Researchers 
have to pick their gene edits carefully, 
because experiments show that seemingly 
advantageous genetic manipulation can 
reduce a mosquito’s ability to survive 

and reproduce compared with its wild 
counterparts. Another problem is that if an 
edit succeeds in making an insect immune 
to infection, it also creates a strong selective 
pressure for the pathogen to evolve a 
means of getting around the modification, 
potentially encouraging new and greater 
challenges to disease control. 

To circumvent some of these problems, 
scientists have proposed tricks, collectively 
known as gene drives, that artificially force 
the dissemination of gene modifications 
through the generations. During normal 
inheritance, there is a 50% chance that 
offspring will inherit a modified gene 
carried on one chromosome. The gene- 
drive system, however, cuts the partner to 
this chromosome and, during the repair 
process, the mutation is copied to the 
partner chromosome so that an edited 
organism will transmit the altered gene to 
almost all of its offspring. In 2011, a team 
led by scientists at Imperial College London 
showed that genetic elements known as 
homing endonucleases could work as gene 
drives in Anopheles. And earlier this year, 
researchers at the University of California, 
San Diego, used CRISPR-Cas9 to generate 
a ‘mutagenic chain reaction’ whereby a 
mutation that is present in just one of a pair 
of chromosomes copies itself to the other 
chromosome of the pair. 
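
The contrast between normal inheritance and a gene drive can be made concrete with a small calculation. The sketch below is only an illustration under simplifying assumptions that are not in the article: random mating, no fitness cost, and a single parameter (here called `conversion`, a hypothetical name) for the fraction of heterozygotes in which the edited allele is copied onto the partner chromosome. It is not a model used by any of the groups mentioned.

```python
def next_frequency(p, conversion=0.0):
    """Frequency of the edited allele after one generation of random mating.

    p          -- current frequency of the edited allele (0 to 1)
    conversion -- assumed fraction of heterozygotes whose wild-type copy is
                  converted to the edited allele (0 = normal inheritance)
    """
    homozygotes = p * p                   # always pass on the edited allele
    heterozygotes = 2 * p * (1 - p)       # carry one edited copy
    transmission = (1 + conversion) / 2   # 0.5 without a drive, up to 1.0
    return homozygotes + heterozygotes * transmission


def simulate(p0, generations, conversion=0.0):
    """Return the allele frequency at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], conversion))
    return freqs


if __name__ == "__main__":
    for label, c in (("normal inheritance", 0.0), ("perfect drive", 1.0)):
        final = simulate(0.01, 10, conversion=c)[-1]
        print(f"{label}: frequency after 10 generations = {final:.4f}")
```

With a 50% transmission rate the edited allele simply holds its starting frequency, whereas with near-complete conversion it approaches fixation within a handful of generations. That speed is what makes containment during testing so difficult.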

Yet many researchers worry about the 
potential ecological effects of unleashing 
gene drives in the wild. As much as 
these modifications have the potential to 
eliminate the proliferation of insects that 
transmit disease to humans, they could 
also accidentally destroy a key segment of a 
food web, facilitating the invasion of another 
species. How to test gene drives properly 
without losing control of them is a catch-22 
situation. C.A. 


provides the variation you want, where you 
want it,” says Bruce Whitelaw, an animal bio- 
technologist at Scotland’s Roslin Institute, near 
Edinburgh. 

In a barn at the Roslin Institute, pigs snuffle 
around, unaware that they illustrate Whitelaw’s 
point perfectly. As fertilized eggs, they had one 
of their immune-system genes edited. The 
gene in question, RELA, is thought to trigger 
the overblown immune reaction that kills pigs 
infected with the haemorrhagic virus that 
causes African swine fever. Whitelaw’s team 
was inspired by the fact that warthogs (which 
belong to the same family as domestic pigs) 
tolerate the infection well, even though their 
version of RELA differs from that of domes- 
tic pigs by only 3 amino acids out of more 
than 500. Whitelaw’s team began the research 
using editing tools called zinc-finger nucleases 
and then transcription activator-like effector 
nuclease (TALEN) technology, and has since 
moved on to CRISPR-Cas9, with the aim of 
editing the pig gene to achieve the exact wart- 
hog RELA sequence. The edited pigs will soon 
be exposed to the pathogen, for which there is 
no vaccine or cure. If the pigs make it through 
unharmed, the team will have found a way to 
protect farmers from devastating losses, par- 
ticularly those in regions where the disease is 
hard to eradicate, such as sub-Saharan Africa 
and Eastern Europe. 

Whitelaw’s pig project will largely benefit 
poor farmers — a rarity for editing research. 
The prospect of tough regulation and conse- 
quently an expensive market-approval process 
has meant that a much more common goal 
among livestock-focused genome editing has 
been to generate higher-profit cattle, pigs and 
sheep with increased muscle mass — often 
by disabling the MSTN gene, which restricts 
muscle growth. 

Similarly, it is of little surprise that the first 
genome-edited crop to emerge — Cibus’s oil- 
seed rape — has a business rationale. Instead 
of focusing on an edit that could, for example, 
boost the vitamin content of the plant’s oil to 
combat malnutrition, the edits allow a farmer 
to spray weedkiller more liberally over his or 
her fields. “I don’t think it’s too extreme to say 
that the way that the technology will be used 
for plant breeding in the future will hinge on 
how it is regulated,” says Jones. 

The question of how to regulate genome- 
edited crops in Europe has been on the table 
for years; the European Commission started to 
look at the issue back in 2007. The commission 
generally considers an organism to be GM if 
its genes are altered in ways that cannot occur 
naturally, suggesting that edited crops should 
be classified as GM. But it also has a record of 
making exceptions for crops in which muta- 
tions have been induced using chemicals or 
radiation. Jones sorely hopes that genome 
editing falls into the latter category. Placing it 
alongside older genetic engineering would, in 
his eyes, be unfair. “It’s almost like comparing 
chalk and cheese,” he says. 


Claire Ainsworth is a science journalist based 
in Hampshire, UK. 
GENOME EDITING OUTLOOK 

4 BIG QUESTIONS 

Despite the popularity of genome-editing techniques, researchers 
are still grappling with the known unknowns of the technologies. 
Here are four of their most pressing questions. 

BY WILL TAUXE 


How much can we reduce the off-target effects of genome editing? 

Unintentional edits can occur where a similar or identical target 
DNA sequence appears elsewhere in the genome. These off-target 
edits can frustrate the use of genome editing as a lab tool, 
and may cause side effects if the technique is used as a therapy. 

The frequency of off-target effects varies among the three 
genome-editing technologies. TALENs produce the fewest off-target 
edits because they use a longer stretch of target DNA than ZFNs 
or CRISPR-Cas9 (see page S4). 

The specificity of CRISPR-Cas9 can be increased by adjusting 
the guide RNA, which leads Cas9 to its target, and Cas9’s 
structure. Bioinformatics can predict where off-target effects 
are most likely to occur and evaluate their consequences. 


Which diseases are suitable targets for genome editing? 

The more diseases that can be addressed through genome 
editing, the greater the technology’s potential to relieve 
the disease burden. 

Genome editing has had some success in combating HIV in 
people with the infection (see page S8), providing hope for 
those with other non-inherited diseases. Encouraging results 
have also been seen in models of certain monogenic diseases (see 
page S10). 

To expand the range of diseases amenable to genome editing, 
researchers need better ways to deliver the technology to 
the right cells. CRISPR-Cas9 is too large to fit inside the vector 
adeno-associated virus. CRISPR systems in different bacteria 
may offer smaller alternatives. 


Can the phenotypic effects of genome editing be accurately predicted? 

For gene editing to be successful, researchers need to be able to 
determine the effect that making small changes to DNA, or to its 
packaging, has on the chemical components and physical 
properties of cells. 

Several approved drugs (that do not edit the genome) treat 
conditions such as epilepsy and cancer by causing chemical 
modifications to DNA that do not change the order of its bases, or by 
altering DNA’s packaging. But no one knows which of the alterations 
lead to these outcomes. 

Researchers have modified genome-editing tools to 
make epigenetic changes. By investigating the changes 
caused by precise edits, they hope to gain a better 
understanding of the role of epigenetics in gene expression, 
and hence phenotype. 


Should we edit the human germ line? 

Making heritable edits has the potential to prevent diseases 
from being passed down the generations. But ‘permanent’ 
changes are risky if we do not have a full understanding of human 
gene expression. There is also the potential for misuse. 

Edits to the genomes of non-viable human embryos have 
established proof-of-principle, although there is a high failure 
rate. If viable embryos are edited, implanting them and bringing 
them to term is just a short step away. 

Scientists need to engage with governments and invite 
informed public discussion to draw up rigorous guidelines that 
govern research and clinical procedure. Systems must then 
be put in place to ensure that these guidelines are followed. 
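
The first of these questions notes that bioinformatics can predict where off-target edits are most likely to occur. As a rough illustration of the idea, the sketch below scans a DNA string for sites that differ from a guide sequence by only a few bases. It is a deliberately naive assumption-laden example: it searches one strand only, ignores any requirement for an adjacent recognition motif, and treats all mismatches as equal, whereas real prediction tools weight them by position. The function names and sequences are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(genome, guide, max_mismatches=3):
    """Return (position, site, mismatches) for every window of the genome
    that differs from the guide at no more than max_mismatches positions.
    A result with 0 mismatches is a perfect match to the guide."""
    genome, guide = genome.upper(), guide.upper()
    width = len(guide)
    hits = []
    for start in range(len(genome) - width + 1):
        site = genome[start:start + width]
        mismatches = hamming(site, guide)
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits


if __name__ == "__main__":
    toy_genome = "TTTTACGTACGTGGGGACGTTCGTCCCC"   # made-up sequence
    for hit in find_candidate_sites(toy_genome, "ACGTACGT", max_mismatches=1):
        print(hit)
```

The longer the recognition sequence, the fewer near-matches a scan like this finds in a genome of a given size, which is the intuition behind the lower off-target rate attributed to TALENs.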


Will Tauxe is a science writer in Atlanta, Georgia. 
Multiple mechanisms for CRISPR-Cas inhibition by anti-CRISPR proteins 


Joseph Bondy-Denomy, Bianca Garcia, Scott Strum, Mingjian Du, MaryClare F. Rollins, Yurima Hidalgo-Reyes, 
Blake Wiedenheft, Karen L. Maxwell & Alan R. Davidson 


The battle for survival between bacteria and the viruses that infect 
them (phages) has led to the evolution of many bacterial defence 
systems and phage-encoded antagonists of these systems. Clustered 
regularly interspaced short palindromic repeats (CRISPR) and the 
CRISPR-associated (cas) genes comprise an adaptive immune system 
that is one of the most widespread means by which bacteria defend 
themselves against phages. We identified the first examples of proteins 
produced by phages that inhibit a CRISPR-Cas system. Here we 
performed biochemical and in vivo investigations of three of these 
anti-CRISPR proteins, and show that each inhibits CRISPR-Cas 
activity through a distinct mechanism. Two block the DNA-binding 
activity of the CRISPR-Cas complex, yet do this by interacting with 
different protein subunits, and using steric or non-steric modes of 
inhibition. The third anti-CRISPR protein operates by binding to the 
Cas3 helicase-nuclease and preventing its recruitment to the DNA- 
bound CRISPR-Cas complex. In vivo, this anti-CRISPR can convert 
the CRISPR-Cas system into a transcriptional repressor, providing 
the first example—to our knowledge—of modulation of CRISPR-Cas 
activity by a protein interactor. The diverse sequences and mechan- 
isms of action of these anti-CRISPR proteins imply an independent 
evolution, and foreshadow the existence of other means by which 
proteins may alter CRISPR-Cas function. 

CRISPR-Cas RNA-guided immune systems are widespread in prokaryotes, 
and play a major part in microbial evolution. In these 
systems, CRISPR arrays are transcribed and processed to generate 
small CRISPR RNAs (crRNAs), which combine with Cas proteins 
to form crRNA-guided surveillance complexes. In the type I-F 
CRISPR-Cas system, the Csy4 protein is a CRISPR-specific endoribonuclease 
that binds to and cleaves each repeat sequence in the 
pre-crRNA. Csy4 remains associated with the 3’ end of the mature 
60-nucleotide crRNA and then assembles with Csy1, Csy2 and Csy3 
proteins to form a 350 kilodalton (kDa) surveillance complex. This 
complex relies on a 32-nucleotide segment of the crRNA for complementary 
base pairing to invading DNA sequences, known as protospacers. 
Binding of target DNA by the Csy complex leads to the 
recruitment of the nuclease-helicase protein Cas3 and subsequent 
phage genome degradation. We previously identified five unique 
type I-F anti-CRISPR proteins. Here we determine the mechanisms 
by which three of these proteins function. 

Three type I-F anti-CRISPRs, AcrF1 (11 kDa, encoded by gene 35 
from phage JBD30), AcrF2 (13 kDa, encoded by gene 30 from phage 
D3112), and AcrF3 (16 kDa, encoded by gene 35 from phage JBD5), 
could be expressed in Escherichia coli and purified to homogeneity. 
Using a previously described E. coli expression system, we also purified 
the 350 kDa Pseudomonas aeruginosa Csy complex, including a 
crRNA and the four Csy proteins. This complex was mixed in vitro 
with each purified anti-CRISPR protein, and fractionated by size- 
exclusion chromatography (SEC). AcrF1 and AcrF2 co-eluted with 
the Csy complex (Fig. 1a and Extended Data Fig. 1), indicating a direct 


interaction. AcrF3 did not co-elute with the Csy complex (Fig. 1b). The 
lack of AcrF3 binding to the Csy complex suggested that it might inhibit 
the CRISPR-Cas system by interacting with Cas3, the helicase-nuclease 
that is responsible for target DNA destruction after recognition by the 
Csy complex. Supporting this hypothesis, AcrF3 co-eluted with purified 
Cas3, while AcrF1 did not (Fig. 1c and Extended Data Fig. 2). These 
experiments demonstrate that each of the three tested anti-CRISPR 
proteins can bind to either the Csy complex or Cas3. 

The Csy complex recognizes foreign DNA targets through sequential 
recognition of a protospacer adjacent motif (PAM) and crRNA-guided 
base pairing to a target. We performed electrophoretic mobility shift 
assays (EMSAs) to demonstrate that the interaction of AcrF1 and AcrF2 
with the Csy complex blocked its ability to bind a 50 base pair (bp) 
double-stranded DNA (dsDNA) target containing a PAM and a 
sequence identical to the crRNA spacer (Fig. 1d). We used isothermal 
titration calorimetry to show that these anti-CRISPRs also blocked bind- 
ing of the Csy complex to an 8-nucleotide single-stranded DNA (ssDNA) 
target complementary to the functionally crucial ‘seed’ region of the 
crRNA (Extended Data Fig. 3). AcrF3, which does not interact with the 
Csy complex, did not inhibit the DNA-binding activity of the Csy com- 
plex (Fig. 1d, lane 5, and Extended Data Fig. 3). 

To probe the potential role of AcrF3 in blocking Cas3 activity, we 
mixed purified Cas3 with the Csy complex and target DNA. In this 
instance, a supershifted species appeared in the EMSA gel that we pre- 
sumed comprised the Csy complex, DNA and Cas3 (Fig. 1d, lane 7; a 
reaction containing only Cas3 and DNA did not display this species, 
lane 6). Importantly, pre-incubation of Cas3 with AcrF3 prevented 
formation of the supershifted complex (Fig. 1d, lane 10), indicating that 
this anti-CRISPR blocks recruitment of Cas3 to the Csy-DNA complex. 
Pre-incubation of Cas3 with AcrF1 or AcrF2 did not have this effect 
(Fig. 1d, lanes 8, 9). Further corroborating the presence of Cas3 in the 
supershifted complex, the addition of ATP prevented formation of this 
species (Fig. 1d, lane 11) and destabilized a preformed complex (lane 13), 
probably owing to the activation of the ATP-dependent helicase activity 
of Cas3, as described for the type I-E CRISPR-Cas system. 

To demonstrate that the described anti-CRISPR mechanisms operate 
in vivo, we targeted the Csy complex to the promoter of the phzM 
gene, which is required in P. aeruginosa for production of the blue-green 
pigment pyocyanin. Binding of the phzM promoter by a Csy 
complex in the absence of Cas3 activity was expected to repress transcription, 
as was previously observed for a type I-E CRISPR-Cas 
system. Consistent with this expectation, targeting of the phzM 
promoter in cells containing a prophage expressing acrF3 resulted in 
cultures with a complete lack of pigment production, similar to a strain 
lacking Cas3 (Fig. 2a; the somewhat higher pigment production in the 
Δcas3 strain is probably due to reduced Csy function). By contrast, 
the expression of acrF1 and acrF2, which inhibit DNA binding by the 
Csy complex, resulted in blue-green cultures, as did expression of the 
phzM promoter targeting crRNA in cells lacking Csy3. Quantitative 
Figure 1 | Anti-CRISPR proteins inhibit CRISPR-Cas function by directly 
interacting with the Csy complex or Cas3. a, b, Purified Csy complex was 
incubated with purified AcrF1 (a) or AcrF3 (b) and the mixture was 
fractionated by SEC. Fractions were analysed by SDS-polyacrylamide gel 
electrophoresis (SDS-PAGE) and are numbered according to their elution 
position (see Extended Data Fig. 1 for SEC of the Csy complex alone or with 
AcrF2). The purified Csy complex or anti-CRISPR (ACR) are shown in the 
second (Csy) and last (ACR) lanes, respectively. c, Purified Cas3 was incubated 
with (right) or without (left) AcrF3 and fractionated by SEC. The eluting 
fractions were analysed by SDS-PAGE as described earlier. The input (In) lanes 
show the protein mixture that was loaded onto the SEC column. MBP, 
maltose-binding protein. d, dsDNA binding by the Csy complex was assayed 
using an EMSA. Csy complex was present in all reactions except for lanes 
1 and 6. Other components added to each reaction are designated above the 
lanes. In the lanes coloured red and blue, the designated components were 
premixed before the addition of DNA. ATP was added to the Csy-DNA-Cas3 
reaction either before the addition of Cas3 (lanes 11, 12) or after (lane 13). 
The supershifted species resulting from Cas3 addition did not migrate into the 
gel upon prolonged electrophoresis, but it is dissociated by the addition of 
ATP (lane 13), demonstrating that the supershift is not caused by aggregated 
inactive protein. 

Figure 2 | Anti-CRISPR proteins interact with Cas proteins in vivo. a, The 
phzM promoter was targeted by a plasmid-encoded crRNA in P. aeruginosa. 
The production of pyocyanin was quantified in different PA14 mutant 
backgrounds (Δcas3, Δcsy3 or ΔphzM) or during the expression of the 
indicated anti-CRISPR from a prophage. The amount of pyocyanin produced 
in the presence of a plasmid producing the crRNA is shown as a percentage of 
the same strain with the empty plasmid vector. An average of three 
independent experiments is shown with error bars representing the standard 
deviation (s.d.). Representative pictures of cultures are shown. The pyocyanin 
ratio for the ΔphzM mutant was derived by comparing it to the value for 
the Δcsy3 mutant. The prophage expressing acrF3 also encoded another anti- 
CRISPR, the functional mechanism of which is not known. To bolster our 
conclusions pertaining to acrF3, we also tested a prophage that expresses an 
86% identical homologue of acrF3, designated acrF3H, and no other anti- 
CRISPR. WT, wild type. b, Lysates of phages expressing the indicated anti- 
CRISPR proteins were spotted in tenfold serial dilutions on bacterial lawns of 
wild-type P. aeruginosa PA14 (top) or the same strain bearing a plasmid that 
overexpresses the Csy subunits (bottom). These phages would be targeted by 
the CRISPR-Cas system in the absence of anti-CRISPR activity. 


polymerase chain reaction with reverse transcription (RT-qPCR) 
experiments showed that these changes in pyocyanin production cor- 
related with reduced transcription of the phzM gene (Extended Data 
Fig. 4). These results demonstrate that the expression of acrF3 blocks 
Cas3 activity in vivo, causing the Csy complex to function as a tran- 
scriptional repressor. Further in vivo experiments showed that phages 
dependent on acrF1 and acrF2 for viability were markedly inhibited 
by overexpression of the Csy complex subunits (Fig. 2b). The elevated 
level of Csy proteins probably increases the number of active Csy 
complexes and/or binds and titrates out anti-CRISPR molecules, 
resulting in insufficient levels of anti-CRISPR proteins to support 
robust phage replication. Phages dependent on acrF3 were not affected 
under these conditions because this anti-CRISPR protein binds to 
Cas3, the level of which is unchanged (overexpression of Cas3 inhib- 
ited cell growth). Interestingly, Csy subunit overexpression also 
inhibited a phage expressing acrF4 (gene 37 from phage JBD26), an 
anti-CRISPR protein that could not be purified. In addition, express- 
ion of this anti-CRISPR in the transcriptional repression assay resulted 
in a blue-green culture (Fig. 2a). These complementary results imply 
that AcrF4 binds the Csy complex, which we have experimentally 
confirmed (Extended Data Fig. 5). We conclude that our in vivo 
experiments are able to distinguish the effects of anti-CRISPR proteins 
that inactivate the Csy complex from those that inhibit Cas3. 
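
The logic of the pigment read-out can be summarized in a few lines. The sketch below is only a restatement of the reasoning in the text, with names chosen for illustration: if the Csy complex cannot bind the phzM promoter, transcription proceeds and the culture is blue-green; if it binds but Cas3 is absent or not recruited, the complex acts as a repressor and no pigment is made. The entry for acrF4 assumes that it prevents promoter binding, which is inferred from the blue-green culture rather than measured directly here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    name: str
    csy_binds_dna: bool    # can the Csy complex still bind the phzM promoter?
    cas3_recruited: bool   # can Cas3 join the DNA-bound complex?


def predict_culture(condition):
    """Qualitative outcome of the transcriptional repression assay."""
    if not condition.csy_binds_dna:
        return "blue-green (phzM transcribed)"
    if not condition.cas3_recruited:
        return "no pigment (phzM repressed)"
    return "not covered by this sketch"


CONDITIONS = [
    Condition("acrF1", csy_binds_dna=False, cas3_recruited=False),
    Condition("acrF2", csy_binds_dna=False, cas3_recruited=False),
    Condition("acrF4", csy_binds_dna=False, cas3_recruited=False),  # inferred
    Condition("Δcsy3", csy_binds_dna=False, cas3_recruited=False),
    Condition("acrF3", csy_binds_dna=True, cas3_recruited=False),
    Condition("Δcas3", csy_binds_dna=True, cas3_recruited=False),
]

if __name__ == "__main__":
    for c in CONDITIONS:
        print(f"{c.name}: {predict_culture(c)}")
```

The table is qualitative only; it ignores the somewhat higher pigment level reported for the Δcas3 strain.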

AcrF1 and AcrF2 both prevent DNA binding by the Csy complex, but 
might achieve this outcome through different mechanisms. The Csy 
complex assembles with a Csy1-Csy2 heterodimer bound at the 5’ end 
of the crRNA and a Csy4 monomer bound to the 3’ end, with six Csy3 
subunits arrayed along the backbone of the spacer region in between 
(Fig. 3a). By purifying the Csy1-Csy2 heterodimer on its own and 
mixing it with purified anti-CRISPR proteins, we found that it co-eluted 
with AcrF2 in SEC experiments, but not with AcrF1 (Fig. 3b and 
Figure 3 | AcrF1 and AcrF2 bind distinct Csy complex subunits. a, A 
schematic of the crRNA showing the repeat-derived regions of the crRNA 
(black) and the 32-nucleotide (nt) spacer region (red). The coloured circles 
represent the Csy1-4 subunits. b, c, Purified 6×His/MBP-tagged Csy1-Csy2 
heterodimer (b) or Csy3 (c) was fractionated by SEC in the presence (right) or 
absence (left) of the indicated anti-CRISPR proteins. The SEC fractions were 
analysed by SDS-PAGE. The ‘In’ lanes show the protein mixture that was 
loaded onto the SEC column and fractions are numbered. d, Purified Csy 
complexes with 16-, 32-, or 48-nucleotide crRNA spacer regions were bound 
to AcrF1 or AcrF2 and fractionated by SEC. The stoichiometry of the 
bound anti-CRISPR proteins was quantified through densitometry of the 
Coomassie blue stained gels. An average of three independent experiments is 
shown with error bars representing s.d. 


Extended Data Fig. 6a). By contrast, AcrF1, but not AcrF2, bound Csy3 
(Fig. 3c and Extended Data Fig. 6b). Csy3 eluted in monomeric and 
multimeric forms in SEC experiments, with AcrF1 binding predomi- 
nantly to the multimeric fraction (Fig. 3c). The presence of distinct 
binding sites for AcrF1 and AcrF2 on the intact Csy complex was corro- 
borated through competition experiments showing that both anti- 
CRISPR proteins could simultaneously bind the Csy complex and that 
the presence of one had no effect on the binding ability of the other 
Figure 4 | Two anti-CRISPR proteins inhibit target recognition via unique 
mechanisms. a, EMSA experiments were used to assay binding of the Csy 
complex to three different ssDNA oligonucleotides (labelled A, B and C) that 
are complementary to different regions of the crRNA spacer as shown in the 
schematic (see Extended Data Fig. 9b). Where noted, the Csy complex was pre- 
incubated with the indicated anti-CRISPR. b, c, Apo-Csy complex (AC) or 
DNA-bound Csy complex (DC) was incubated with AcrF1 or AcrF2. b, This 
mixture was fractionated by SEC and fractions were visualized by SDS-PAGE. 
c, An EMSA experiment is shown with binding to dsDNA in the same 
experimental setup as in b. d, A model summarizing anti-CRISPR mechanisms. 
Arrows indicate the steps of the uninhibited CRISPR-Cas interference 
pathway. Numbers in the Csy complex indicate the Csy subunits. The lines with 
flat ends indicate the step in the CRISPR-Cas pathway blocked by each anti- 
CRISPR. The manner in which each anti-CRISPR binds to CRISPR-Cas 
components is also shown. AcrF1 makes the whole crRNA inaccessible while 
AcrF2 occludes the 5’ end. 


(Extended Data Fig. 6c). RNase A treatment of the Csy complex, which 
resulted in Csy4 dissociation, had no effect on the binding of either 
anti-CRISPR (Extended Data Fig. 7). Quantification of the co-eluted 
fractions of AcrF1 or AcrF2 with the Csy complex by protein gel electrophoresis 
revealed the stoichiometry of AcrF1 to be 2.6 ± 0.3 proteins per 
Csy complex, while AcrF2 was 0.8 ± 0.1 (Extended Data Fig. 7c). To 
verify these stoichiometries, we created Csy complexes with shorter 
(16 nucleotides; Csy16 complex) and longer spacer regions (48 nucleotides; 
Csy48 complex). The purified Csy16 complex contained fewer molecules 
of Csy3 (4 ± 0.7) than wild type, and the Csy48 complex contained a 
proportionally greater number (9 ± 0.8) (Fig. 3d and Extended Data 
Fig. 8). Concomitant with the altered number of Csy3 molecules in the 
Csy16 and Csy48 complexes, we observed corresponding changes in the 
number of AcrF1 molecules bound, with the ratio of Csy3 to AcrF1 
remaining constant. These results imply that AcrF1 binds along the full 
length of the Csy3 ‘spine’ of the complex. Its binding sites are probably at 
the interaction interfaces of the Csy3 subunits, which would account for 
the 2:1 Csy3/AcrF1 stoichiometry and for AcrF1 binding to only the 
multimeric Csy3 fraction (Fig. 3c). In contrast to AcrF1, the number of 
AcrF2 molecules bound to the altered Csy complexes did not change as 
the number of Csy3 molecules increased or decreased, consistent with 
AcrF2 binding to the Csy1-Csy2 heterodimer. 
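
The arithmetic behind the proposed 2:1 Csy3/AcrF1 stoichiometry can be checked with the numbers quoted above. The sketch below is a simple worked example and not part of the analysis itself: it takes the Csy3 copy numbers given in the text (about 4, 6 and 9 for the 16-, 32- and 48-nucleotide spacers), computes how many AcrF1 molecules a strict 2:1 ratio would predict, and compares the wild-type prediction with the measured 2.6 ± 0.3. The function names are illustrative only.

```python
# Csy3 copies per complex for each spacer length (nucleotides), from the text
CSY3_PER_COMPLEX = {16: 4.0, 32: 6.0, 48: 9.0}

# Proposed ratio: one AcrF1 for every two Csy3 subunits
CSY3_PER_ACRF1 = 2.0


def expected_acrf1(spacer_nt, csy3_per_acrf1=CSY3_PER_ACRF1):
    """AcrF1 molecules per complex predicted by a fixed Csy3:AcrF1 ratio."""
    return CSY3_PER_COMPLEX[spacer_nt] / csy3_per_acrf1


def implied_ratio(csy3_copies, acrf1_copies):
    """Csy3:AcrF1 ratio implied by a pair of measured copy numbers."""
    return csy3_copies / acrf1_copies


if __name__ == "__main__":
    for spacer in sorted(CSY3_PER_COMPLEX):
        print(f"{spacer}-nt spacer: expect {expected_acrf1(spacer):.1f} AcrF1")
    measured_wild_type = 2.6   # AcrF1 per wild-type complex, from the text
    ratio = implied_ratio(CSY3_PER_COMPLEX[32], measured_wild_type)
    print(f"wild type: measured ratio of about {ratio:.1f} Csy3 per AcrF1")
```

The wild-type measurement gives a ratio slightly above 2, and the value expected under a strict 2:1 ratio (3.0) lies a little more than one reported standard deviation above the measured 2.6.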

To define further the sites of action of the anti-CRISPR proteins on 
the Csy complex, we performed DNA-binding assays using ssDNA 
molecules complementary to subregions of the crRNA spacer. As 
shown in Fig. 4a, AcrF1 inhibited binding to all the ssDNA molecules 
tested. By contrast, AcrF2 prevented binding to a 24-nucleotide ssDNA 
molecule complementary to the 5’ end of the crRNA, including the 
seed region, but did not inhibit binding to a 16-nucleotide ssDNA 
complementary to the 3’ end of the spacer. Binding to a 26-nucleotide 
ssDNA binding the 3’ end was only partially inhibited. These data 
suggest that AcrF2 inhibits DNA binding by sterically blocking the 
5’ end of the crRNA spacer through its interaction with Csy1-Csy2, 
which is expected to be bound to this region of the crRNA. Addition 
of AcrF2 to a Csy complex that had been pre-saturated with target 
DNA resulted in an approximately 60% decrease in the binding level 
of this anti-CRISPR, suggesting that AcrF2 and DNA compete for an 
overlapping binding interface (Fig. 4b and Extended Data Fig. 9a). 
Consistent with this result, addition of AcrF2 to a DNA-bound Csy 
complex resulted in appreciably decreased DNA binding as detected by 
EMSA (Fig. 4c). Parallel experiments performed with AcrF1 showed 
that the binding of AcrF1 to the Csy complex was not affected by prior 
binding to DNA (Fig. 4b). We conclude that the interaction of AcrF1 
with the full length of the spine of the complex formed by multiple Csy3 
molecules and the crRNA accounts for its ability to block binding to all 
dsDNA and ssDNA molecules tested. Furthermore, the ability of AcrF1 
and DNA to bind the Csy complex simultaneously suggests an allo- 
steric mechanism for the activity of this anti-CRISPR. Thus, the 
mechanisms of AcrF1 and AcrF2 are distinct, using different Csy 
protein-binding partners, stoichiometry and DNA occlusion mechan- 
isms (that is, steric versus allosteric). 

We provide the first insight into the mechanisms by which proteins 
can inhibit a CRISPR-Cas system. The diverse and distinct mechan- 
isms discovered here (Fig. 4d) reflect the deep evolutionary roots of the 
virus-host arms race. Anti-CRISPR proteins, both known and yet to 
be discovered, will provide an extensive set of valuable tools both better 
to understand and to manipulate CRISPR-Cas systems. One example 
is our finding that AcrF3 converts the CRISPR-Cas system into a gene 
regulator by blocking Cas3 recruitment. Since CRISPR-Cas systems 
perform a variety of roles beyond destroying foreign DNA, many 
important functions may be fulfilled by proteins that interact with 
CRISPR-Cas components and thus alter the activity of the system. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


Protein purification. All proteins were affinity purified using Ni-NTA beads 
(Qiagen) to isolate recombinant proteins bearing a terminal 6×His tag. Anti-CRISPR 
proteins were expressed from the p15TV-L vector (NCBI accession number 
EF456736), which possesses a T7 promoter and an amino-terminal 6×His tag. 
Constructs expressing Csy1-4 containing a 6×His tag on either Csy3 or Csy4 were 
co-expressed with a construct producing a crRNA as previously described. 
Individual Cas proteins (Csy1-Csy2, Csy3, and Cas3) were expressed from 
pHMGWA (NCBI accession number EU680841), which also has a T7 promoter. 
The proteins in this vector were tagged with a maltose-binding protein and 6×His. 
Cultures of E. coli BL21 containing a plasmid expressing a protein of interest 
were grown to an optical density (OD600 nm) of 0.5 and then induced with 1 mM 
isopropyl-β-D-thiogalactoside (IPTG) for 3 h at 37 °C (anti-CRISPRs, Csy3) or for 
16 h at room temperature (Csy complex, Csy1-Csy2, Cas3). Cells were collected 
by centrifugation at 5,000g for 10 min and resuspended in a binding buffer 
(20 mM Tris, pH 7.5, 250 mM NaCl, 5 mM imidazole, 1 mM dithiothreitol 
(DTT) and 1 mM PMSF). The cells were lysed by sonication and the resulting 
lysate was centrifuged at 15,000g for 15 min to remove cell debris. The supernatant 
was mixed with Ni-NTA beads that had been washed in binding buffer (without 
DTT) five times. Binding to the beads proceeded for 1 h at 4 °C under gentle 
rotation, at which point the lysate and beads were passed through a column, 
washed 3-5 times with binding buffer containing 30 mM imidazole and ultimately 
eluted in buffer containing 250 mM imidazole. Colourimetric Bradford assays 
were conducted during the procedure to determine the number of washes to 
perform and elution fractions to collect. Purified protein was dialysed into the 
binding buffer containing 5 mM imidazole to remove excess imidazole and visualized 
on Coomassie blue R250 stained SDS-PAGE gels. Cas3 was purified following 
the same general protocol but in a buffer optimized for this protein 
(50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol, 1 mM DTT, supplemented 
with 1 mM PMSF and 150 μM NiSO4 in the lysis buffer). Purified Cas3 was concentrated 
and buffer exchanged in an Amicon Ultra centrifugal filter (Millipore) 
into a different buffer (20 mM HEPES, pH 7.5, 300 mM KCl, 5% glycerol, 1 mM 
DTT) for protein interaction assays. Csy1-Csy2 was also purified in the same buffer as 
Cas3 (with NiSO4 omitted). Purified Csy1-Csy2 was then dialysed into a different 
buffer (20 mM HEPES, pH 7.5, 250 mM NaCl, 5% glycerol, 1 mM DTT) for protein 
interaction experiments. 
Size-exclusion chromatography. Affinity-purified proteins were fractionated by 
SEC using a GE Life Sciences Superdex 200 10/300 column. Fractions were col- 
lected in 0.5 ml volumes and monitored by optical density at 280 nm. SDS-PAGE 
gels were stained with silver nitrate or Coomassie blue R250 to identify proteins. In 
interaction experiments, purified proteins were mixed together before fractiona- 
tion by SEC and co-eluting proteins were identified by SDS-PAGE. The Csy 
complex or Csy proteins and an anti-CRISPR protein of interest were generally 
incubated together for 1 h at 4 °C. This mixture was then applied to the SEC 
column at room temperature. A fraction of the input (~0.5%) was also kept for 
SDS-PAGE analysis. 
Anti-CRISPR stoichiometry. The purified Csy complex was incubated with 
~10-fold molar excess of purified anti-CRISPR proteins. This mixture was frac- 
tionated by SEC as described earlier. The Csy complex peak fraction was run on 
SDS-PAGE gels in twofold serial dilutions. The protein bands were identified with 
Coomassie blue R250. Image Lab Software (Bio-Rad) was used to quantify band 
intensities and calculate the relative stoichiometries of the various subunits and 
anti-CRISPRs, after adjusting for molecular weight and comparing dilutions. Our 
estimates of the absolute stoichiometries of the Csy subunits are based on the 
stoichiometry of the Csy complex established in previous publications”*. 
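The molecular-weight adjustment described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis: the function name, subunit labels, band intensities and molecular weights are all placeholder assumptions, and the sketch assumes Coomassie staining scales with protein mass. Each band intensity is divided by its molecular weight to give a value proportional to moles, then scaled to a reference subunit of known copy number.

```python
def relative_stoichiometry(intensity, mw_kda, reference, reference_copies):
    """Convert stained-band intensities to estimated copies per complex.

    Intensity divided by molecular weight is proportional to moles
    (assuming stain intensity scales with protein mass). The values are
    then scaled so the reference subunit has its known copy number.
    """
    molar = {name: intensity[name] / mw_kda[name] for name in intensity}
    scale = reference_copies / molar[reference]
    return {name: value * scale for name, value in molar.items()}


# Placeholder numbers for illustration only (not measured data).
intensity = {"subunit_A": 400.0, "subunit_B": 600.0, "inhibitor": 90.0}
mw_kda = {"subunit_A": 40.0, "subunit_B": 20.0, "inhibitor": 9.0}

copies = relative_stoichiometry(intensity, mw_kda, "subunit_A", 1)
for name, n in copies.items():
    print(f"{name}: {n:.1f}")
```

Comparing several twofold dilutions, as in the text, helps confirm that the bands being compared lie in the linear range of the stain.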
RNase A treatment of the Csy complex. Pancreatic RNase A (73 µM) was used 
to treat the Csy complex (4 µM) for 30 min at 37 °C. After digestion, the treated 
Csy complex was fractionated by SEC in the absence or presence of an anti- 
CRISPR protein. Fractions from SEC were analysed on Coomassie stained 
SDS-PAGE gels to visualize proteins and SYBR Gold stained TBE-Urea gels to 
visualize nucleic acid. 
Isothermal titration calorimetry. Purified Csy complex was added to the isothermal 
titration calorimetry (ITC) chamber at a concentration of 7.5 µM. The 
DNA ligand (8-nucleotide ssDNA) was placed in the injection syringe at a concentration 
of 75 µM. After a null injection of 0.3 µl of titrant, 3 µl of titrant were 
injected 13 times, with 120 s intervals between the injections to establish a baseline. 
The DNA titrant and Csy complex were in the same buffer (20 mM Tris, pH 7.5, 
250 mM NaCl, 5 mM imidazole) and the experiment was temperature controlled 
at 25°C. To assess the role of AcrF1 in interfering with the interaction between 
the Csy complex and a DNA target, the Csy complex was first incubated with a 
~10-fold molar excess of anti-CRISPR proteins for 1 h at 4 °C. This mixture was 
then applied to the chamber, the temperature equilibrated to 25 °C and the DNA 
titration performed. 


Electrophoretic mobility shift assay. A 50-nucleotide ssDNA molecule was syn- 
thesized (Eurofins Genomics) that contains 32 nucleotides of complementarity to 
the crRNA in the purified Csy complex. The DNA (200 nM) was phosphorylated 
in a T4 polynucleotide kinase reaction with [γ-32P]ATP. The reaction was stopped 
with 12 mM EDTA and GE MicroSpin G-25 columns were used to remove 
remaining radiolabelled nucleotides. To generate dsDNA, the labelled strand 
was heated to 98 °C in the presence of a twofold excess of an unlabelled comple- 
mentary strand and allowed to return slowly to room temperature. Csy complex- 
DNA-binding reactions were conducted in a binding buffer (10 mM HEPES, pH 
7.5, 1 mM MgCl2, 20 mM KCl, 1 mM TCEP, bromophenol blue and 6% glycerol) 
at 37 °C for 15 min. The concentration of the Csy complex used in EMSA experiments 
varied, depending on the oligonucleotide target being used. For 50 bp 
dsDNA EMSA reactions, 100 nM of the Csy complex was routinely used in reac- 
tions, with <1 nM labelled DNA. Anti-CRISPR proteins were used at a tenfold 
molar excess compared to the Csy complex and allowed to incubate with Apo-Csy 
complex or DNA-bound Csy complex for 1 h. After the appropriate incubation, 
the reactions were resolved on native 6% polyacrylamide TBE gels. Gels were 
wrapped in Saran wrap and visualized with a phosphoscreen and Typhoon imager. 
Optimal exposures were ~2-3 h. 

For EMSA experiments involving Cas3, the Csy complex and target DNA were 
prebound as described above. 6X His-tagged Cas3 was purified by Ni-NTA chromatography 
(6X His) followed by SEC, concentrated, transferred into the EMSA 
reaction buffer, flash frozen in small volumes (50 µl) and stored at −70 °C. Cas3 
was added to the EMSA reaction at a final concentration of 400 nM and incubated 
for 30 min at 37 °C. ATP was added at a final concentration of 2 mM and all 
reactions with Cas3 also contained 100 µM CoCl2. 

Pyocyanin repression. A crRNA was designed to target the promoter region of 
phzM, a gene required for the biosynthesis of the blue-green pigment pyocyanin. 
Two complementary oligonucleotides were synthesized containing two 28 bp 
PA14 CRISPR repeat sequences, flanking a 32 bp sequence with perfect complementarity 
to the −35/−10 region of the phzM promoter (position 813576-813607 
in the PA14 genome). The spacer was designed to produce a crRNA that would 
bind to the non-template strand, in a position where the protospacer adjacent motif 
(GG) is present. The oligonucleotides were annealed and cloned into an arabinose 
inducible P. aeruginosa expression vector, pHERD30T. This construct was then 
used to transform PA14 strains possessing single cas gene knockouts or wild-type 
PA14 possessing prophages expressing various anti-CRISPRs. Individual transformants 
were grown overnight (~20 h) in 2 ml of King’s A medium with 50 µg ml−1 
gentamicin and 0.025% arabinose, to induce expression of the crRNA. Pyocyanin 
was extracted with an equal volume of chloroform, and then mixed with 1 ml of 
0.2 M HCl, producing a pink-red colour proportional to the amount of pyocyanin, 
which was quantitated by measuring absorbance at 520 nm. Anti-CRISPR proteins 
were expressed from the following prophages: JBD30 (AcrF1), D3112 (AcrF2), 
JBD26 (AcrF4), JBD5 (AcrF3 and AcrF5), and JBD88a (AcrF3H). Since phage 
JBD5 contains two type I-F anti-CRISPR proteins, phage JBD88a (possessing a 
homologue of AcrF3 with 86% protein sequence identity) was also used. 
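As a quick consistency check on the crRNA construct described above, the arithmetic can be sketched as follows. The repeat length and genome coordinates are taken from the text. Treating the coordinate range as inclusive and ignoring any cloning overhangs are assumptions made here.

```python
REPEAT_LEN = 28          # bp, each PA14 CRISPR repeat (from the text)
SPACER_START = 813576    # PA14 genome coordinate (from the text)
SPACER_END = 813607      # end coordinate, assumed inclusive

# An inclusive interval [start, end] spans end - start + 1 positions.
spacer_len = SPACER_END - SPACER_START + 1

# Repeat, spacer, repeat: the core of the synthesized insert,
# not counting any overhangs added for cloning.
insert_len = REPEAT_LEN + spacer_len + REPEAT_LEN

print(spacer_len)   # 32
print(insert_len)   # 88
```

The 32-position interval agrees with the 32 bp spacer stated in the text.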
Competition experiments. To determine whether the two anti-CRISPR proteins 
that bind to the Csy complex compete with each other for the same binding site, 
the first anti-CRISPR was added for 1 h at 4°C and then the second for the same 
amount of time. This entire mixture was then fractionated by SEC. 

To determine whether DNA and anti-CRISPR proteins compete for the same 
binding site, the purified Csy complex (4.5 µM) was mixed with a 50 bp dsDNA 
target (10 µM) and incubated for 15 min at 37 °C in the same buffer in which the 
proteins were purified (20 mM Tris, pH 7.5, 250 mM NaCl, 5 mM imidazole). This 
DNA-bound Csy complex was then mixed with a tenfold molar excess of AcrF1, 
AcrF2, or an equivalent volume of buffer and incubated for 1 h at 4 °C. This 
mixture was fractionated by SEC. The fraction containing the Csy complex was 
analysed on Coomassie blue stained SDS-PAGE gels or SYBR Gold stained TBE- 
Urea gels. 

Plaque assays with Csy subunit overexpression. To assess the consequence of 
Csy protein overexpression on phages possessing distinct anti-CRISPR proteins 
in vivo, a pHERD30T-derived plasmid expressing the csy1, csy2, csy3 and csy4 genes 
was used to transform P. aeruginosa strain PA14. Phage lysates were spotted in 
tenfold serial dilutions onto a lawn of PA14 containing empty vector, or the 
plasmid expressing the csy genes. Phages JBD30, JBD26, D3112 and JBD88a all 
have protospacers that display 100% identity to spacers 17 and 20 in the PA14 
CRISPR2 locus*. JBD5 has a protospacer matching CRISPR2 spacer 1 that has been 
shown to be targeted*?'. 

RT-qPCR. RT-qPCR reactions were conducted as described previously*. Briefly, 
total RNA was extracted and DNase treated. One nanogram of total RNA was 
subjected to a reverse transcription reaction and qPCR, using primers specific 
to phzM or a control, rpsL. The efficient removal of DNA from the RNA 
preparation was confirmed by including controls for each sample without reverse 
transcriptase added. 
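The text does not state how phzM expression was calculated from the qPCR data. One common convention is the 2^-ΔΔCt method, sketched below as an assumption: the function name and Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both primer pairs.

```python
def percent_repression(ct_target_sample, ct_ref_sample,
                       ct_target_control, ct_ref_control):
    """Percent repression of a target gene by the 2^-ddCt method.

    Each target Ct is first normalized to the reference gene (dCt),
    then the sample is compared with the control (ddCt). Assumes the
    amount of product doubles every cycle for both primer pairs.
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control
    relative_expression = 2 ** (-dd_ct)
    return (1 - relative_expression) * 100


# Hypothetical Ct values: target = phzM, reference = rpsL.
repression = percent_repression(
    ct_target_sample=26.0, ct_ref_sample=18.0,    # strain with targeting crRNA
    ct_target_control=24.0, ct_ref_control=18.0,  # control strain, empty plasmid
)
print(f"{repression:.0f}% repression")
```

With these placeholder values the target signal appears two cycles later in the sample than in the control, which corresponds to a quarter of the control expression level.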

AcrF2 misannotation. The D3112 phage genome has an annotated open reading 
frame identified as gene 30, which is a predicted 90 amino acid protein (NCBI 
accession number NC_005178). This version of the gene was previously identified 
as an anti-CRISPR, although overexpression from a plasmid was required for 
activity*. A nucleotide alignment of the anti-CRISPR region of many phages 
revealed that all phage anti-CRISPR operons possess a start codon (ATG) at the 
same position for the first anti-CRISPR gene, except phage D3112. Phages D3112 
and MP29 (which has a D3112 gene 30 homologue), had the start position anno- 
tated downstream of this commonly used ATG, at a second ATG, in frame with 
the first, resulting in a putative truncation of six amino acid residues. Re-cloning of 
the gene to include these six residues resulted in a construct that had full anti- 
CRISPR activity in the absence of overexpression. Thus, this 96-residue protein 
(sequence shown later, with new residues in bold) is the version that was used in all 
downstream experiments presented here and in affinity purification, after addition 
of the appropriate tag. All other anti-CRISPR protein sequences are as reported in 
ref. 4. AcrF2: MTKTAQMIAQQHKDTVAACEAAEAIAIAKDQVWDGEGYT
KYTFDDNSVLIQSGTTQYAMDADDADSIKGYADWLDDEARSAEASEIER
LLESVEEE. 
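The re-annotation described above can be checked against the printed sequence with a short sketch. The sequence is copied from the text as extracted, so any transcription error in it would carry over. The variable names are illustrative.

```python
# AcrF2 sequence as printed in the text (corrected 96-residue version).
ACRF2 = (
    "MTKTAQMIAQQHKDTVAACEAAEAIAIAKDQVWDGEGYT"
    "KYTFDDNSVLIQSGTTQYAMDADDADSIKGYADWLDDEARSAEASEIER"
    "LLESVEEE"
)

N_NEW_RESIDUES = 6  # residues restored by using the upstream ATG

# The originally annotated product starts at the second, in-frame ATG.
truncated = ACRF2[N_NEW_RESIDUES:]

print(len(ACRF2))      # 96
print(len(truncated))  # 90
print(truncated[0])    # M
```

The lengths agree with the 96-residue corrected protein and the 90-residue annotated gene 30 product. The truncated form begins with a methionine, as expected for translation from a second in-frame ATG.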

Statistics, reagents and data deposition. To assess interactions between anti- 
CRISPR proteins and the Csy complex or purified Cas proteins, mixed compo- 
nents were fractionated by SEC. Each result shown in the manuscript was obtained 
on at least two independent occasions. ITC, EMSA and plaque assays were all 
replicated at least three times. No statistical methods were used to predetermine 
sample size. The experiments were not randomized. The investigators were not 
blinded to allocation during experiments and outcome assessment. 

The sequences of the anti-CRISPR proteins are present in ref. 4, with full genomes 
for phages JBD30, D3112, JBD5, JBD26 and JBD88a available on NCBI (accession 
numbers: NC_020198, NC_005178, NC_020202, JN811560 and NC_020200, 
respectively). 


21. Cady, K. C., Bondy-Denomy, J., Heussler, G. E., Davidson, A. R. & O'Toole, G. A. The 
CRISPR/Cas adaptive immune system of Pseudomonas aeruginosa mediates 
resistance to naturally occurring and engineered phages. J. Bacteriol. 194, 
5728-5738 (2012). 
Extended Data Figure 1 | AcrF2 interacts with the Csy complex. a, b, Purified Csy complex was fractionated by SEC alone (a) or in the presence of AcrF2 
(b). Fractions were analysed on a silver nitrate stained SDS-PAGE gel. The input (IN) and fractions are shown. 
Extended Data Figure 2 | AcrF3, not AcrF1, interacts with Cas3. a, Cas3 
was fractionated by SEC alone or in the presence of AcrF3 or AcrF1. Overlays 
of plots of elution volume versus optical density at 280 nm of the column 
eluates are shown. The numbers represent the fractions that were selected for 
analysis. b-e, Silver nitrate stained SDS-PAGE gels are shown from SEC 
experiments with Cas3 (b), AcrF3 (c), Cas3 with AcrF3 (d) or Cas3 with AcrF1 (e). 
The sample that was loaded onto the SEC column is shown as input (In) and 
fractions from the same elution positions are indicated numerically. AcrF3 is 
seen eluting in fractions 4-8 only in the presence of Cas3. There is also a 
visible shift in the Cas3 elution profile in the presence of AcrF3 but not AcrF1 
(fractions 3-5). 
Extended Data Figure 3 | AcrF1 and AcrF2 prevent target recognition by 
the Csy complex. Isothermal titration calorimetry (ITC) assays showing the 
Csy complex binding to an 8-nucleotide ssDNA target that comprises the 
seed region. No binding is observed in the presence of AcrF1, AcrF2 or with a 
non-target (the reverse complement sequence of the target) ssDNA substrate. 


A representative run is shown for each condition with the dissociation constant 
(Kd) value and error of fit from that particular run. Over multiple runs (n = 6) 
with the Csy complex binding to the ssDNA ligand, the average Kd value 
was 90 ± 37 nM. 
Extended Data Figure 4 | Expression of phzM is repressed by the Csy 
complex. The Csy complex was targeted to the promoter of the gene phzM, and 
repression efficiency was assayed by RT-qPCR (see Methods). The per cent 
repression of phzM in the indicated strains expressing a phzM-targeting crRNA 
relative to wild-type (WT) PA14 with an empty plasmid is shown. All values 
were normalized to rpsL, a gene encoding a ribosomal protein. Means ± s.d. 
are shown. 
Extended Data Figure 5 | AcrF4 interacts with the Csy complex. Untagged 
AcrF4 was expressed in E. coli BL21 cells and a crude lysate of these cells was 
mixed with the Csy complex bound to Ni-NTA beads via a 6X His tag on 
Csy3. a, The flow through (FT), wash 1 (W1), and two elution fractions (E1, E2) 
from the Ni-NTA column are shown, as well as a comparison to pure Csy 
complex. b, The Ni-NTA elution fractions were fractionated by SEC, 
demonstrating a stable interaction between the Csy complex and AcrF4. The 
input (In) lane shows the sample that was loaded on the SEC column and 
numbered fractions are analysed on SDS-PAGE gels. 
Extended Data Figure 6 | AcrF1 and AcrF2 bind the Csy complex at distinct 
locations. a, Purified Csy1-Csy2 heterodimer with an MBP and 6X His tag 
fused to Csy1 was fractionated by SEC in the presence or absence of AcrF1 
(boxes indicate the Csy1-Csy2 peak). b, Purified MBP/6X His-tagged Csy3 was 
fractionated in the presence or absence of AcrF2. These are complementary 
experiments to those seen in Fig. 3b and c, respectively. Input (In) and selected 
fractions are shown on SDS-PAGE gels. c, AcrF1 and AcrF2 were incubated 
with the Csy complex singly or in combination. Asterisks designate which 
anti-CRISPR was added first to the reactions containing both anti-CRISPR 
proteins. The addition order did not affect the result since there is no 
competition for binding sites between these two anti-CRISPR proteins. After 
incubation, each mixture was fractionated by SEC and the peak Csy 
complex fraction is shown on an SDS-PAGE gel. In each experiment the 
anti-CRISPR proteins are in excess relative to the Csy complex. 
Extended Data Figure 7 | AcrF1 and AcrF2 interact with an RNase-A-treated 
Csy complex. a, The Csy complex was treated with a low concentration 
(600 nM, +) of RNase A or a high concentration of RNase A (70 µM, ++). 
This mixture was fractionated by SEC, revealing Csy4 dissociation at the higher 
RNase A concentration. Pre-treatment of the Csy complex with RNase A, 
with the subsequent addition of AcrF1 or AcrF2 followed by SEC fractionation 
was then conducted. Peak Csy complex fractions are shown on an SDS-PAGE 
gel. b, A TBE-urea denaturing gel is shown, stained with SYBR gold, showing 
the native crRNA in the Csy complex and the protected fragments remaining 
after 70 µM RNase A treatment. c, Quantification of Coomassie blue 
stained gels from three independent preparations of the respective proteins is 
shown. Anti-CRISPR proteins bound with unaltered stoichiometry to 
RNase-A-pre-treated Csy complexes. Error bars represent s.d. 
Extended Data Figure 8 | Twofold dilutions used to quantify anti-CRISPR 
binding stoichiometry. Csy complexes with crRNA molecules possessing 
spacers of differing lengths (16, 32, or 48 nucleotides) were purified and 
fractionated by SEC in the presence of AcrF1. A representative Coomassie blue 
stained SDS-PAGE gel is shown, with twofold dilutions of the peak fraction 
containing the Csy complex and co-eluting AcrF1. Arrows on the bottom of the 
gel indicate comparable dilutions based on the levels of Csy1. Note the 
increasing abundance of Csy3 and AcrF1. b, Lanes with arrows from the gel in 
a are shown next to each other for comparison. 
Extended Data Figure 9 | dsDNA binds to the Csy complex after SEC 
fractionation. a, The same samples from Fig. 4a were run on a denaturing 
TBE-urea gel, stained with SYBR gold, to reveal the crRNA (two species are 
apparent), and the Csy-complex-bound 50 bp dsDNA. In these experiments, 
DNA was prebound to the Csy complex, and AcrF1 or AcrF2 were 
subsequently added to the DNA-saturated Csy complex. This mixture was then 
fractionated by SEC and the Csy-complex-containing peak fractions were 
analysed. b, A schematic showing the crRNA sequence with repeat-derived 
regions shown in black and the variable 32-nucleotide spacer region in red. The 
seed-interacting region that is critical for target recognition (nucleotides 1-5, 7, 
8) is in bold. DNA oligonucleotides used in this study are shown, with labels ‘A’, 
‘B’ and ‘C’ corresponding to the targets shown in Fig. 4c. The 8-nucleotide 
ssDNA substrate was used in ITC experiments (Extended Data Fig. 3), and the 
50 bp dsDNA in EMSAs (Figs 1d and 4b). 
DNA-free genome editing in 
plants with preassembled 
CRISPR-Cas9 ribonucleoproteins 


Je Wook Woo!’, Jungeun Kim?*7, Soon Il Kwon!, 
Sang-Gyu Kim?, Sang-Tae Kim?, Sunghwa Choe!*> & 
Jin-Soo Kim? 


Editing plant genomes without introducing foreign DNA 

into cells may alleviate regulatory concerns related to 
genetically modified plants. We transfected preassembled 
complexes of purified Cas9 protein and guide RNA into plant 
protoplasts of Arabidopsis thaliana, tobacco, lettuce and rice 
and achieved targeted mutagenesis in regenerated plants 

at frequencies of up to 46%. The targeted sites contained 
germline-transmissible small insertions or deletions that are 
indistinguishable from naturally occurring genetic variation. 


Programmable nucleases, such as zinc-finger nucleases (ZFNs), 
transcription activator-like effector nucleases (TALENs) and RNA- 
guided endonucleases (RGENs), have been used for genome editing 
in multiple cells and species including plants!-3, paving the way for 
novel applications in biomedical research, medicine and biotechnol- 
ogy*. CRISPR RGENs are rapidly superseding ZFNs and TALENs 
owing to their ease of use; RGENs that consist of the Cas9 protein 
derived from Streptococcus pyogenes and guide RNAs (gRNAs) can 
be customized by replacing only the RNA component, sidestepping 
the labor-intensive and time-consuming protein engineering needed 
to customize TALENs and ZFNs. Programmable nucleases, delivered 
into plant cells either by using Agrobacterium tumefaciens or by trans- 
fecting plasmids that encode them, cleave chromosomal target sites 
in a sequence-dependent manner, producing site-specific DNA 
double-strand breaks (DSBs). The repair of these DSBs by endogenous 
systems results in targeted genome modifications. 

It remains unclear whether genome-edited plants will be regulated 
by genetically modified organism (GMO) legislation in the EU and 
other regions°. Programmable nucleases induce small insertions and 
deletions (indels) or substitutions at chromosomal target sites that are 
indistinguishable from naturally occurring genetic variation. However, 
mutated plants might be considered GMOs by regulatory authorities in 
certain countries, which could reduce the potentially widespread use 
of programmable nucleases in plant biotechnology and agriculture. 


For example, when A. tumefaciens is used as a delivery vector, the 
resulting genome-edited plants contain foreign DNA sequences, 
including those that encode the programmable nucleases, in the host 
genome. Removal of these A. tumefaciens-derived DNA sequences 
by breeding is not feasible in species such as grape, potato and banana 
that reproduce asexually. 

Non-integrating plasmids could be transfected into plant cells to 
deliver programmable nucleases. However, transfected plasmids are 
degraded in cells by endogenous nucleases, and the resulting small 
DNA fragments are sometimes inserted at both on-target and off- 
target sites in host cells®; therefore, this approach might be unsuitable 
in plants if regulatory approval is required. 

Delivery of preassembled Cas9 protein-gRNA ribonucleoproteins 
(RNPs), rather than plasmids that encode these components, into 
plant cells could remove the likelihood of inserting recombinant DNA 
in the host genome’. Furthermore, as has been shown in cultured 
human cells®, RGEN RNPs cleave chromosomal target sites imme- 
diately after transfection and are degraded rapidly by endogenous 
proteases in cells, which might reduce the frequency of mosaicism 
and off-target effects in regenerated whole plants. Because there is no 
need to optimize codon usage or find promoters that will express Cas9 
and gRNAs when using protein-and-RNA-only systems, the use of 
preassembled RGEN RNPs could broaden the applicability of genome 
editing to all transformable plant species. In addition, using RGEN 
RNPs enables in vitro prescreening to guide the choice of highly active 
gRNAs® and genotyping of mutant clones via restriction fragment 
length polymorphism (RFLP) analysis’. To the best of our knowledge, 
RGEN RNPs have not been used in any plant species. 

Here we report the delivery of RGEN RNPs into protoplasts of 
various plant species and the induction of targeted genome modifica- 
tions in whole plants regenerated from them. Purified Cas9 protein 
was mixed with a two- to tenfold molar excess of gRNAs targeting 
four genes from three plant species in vitro to form preassembled 
RNPs. The RGEN RNPs were incubated with protoplasts derived from 
A. thaliana, tobacco (Nicotiana attenuata) and rice (Oryza sativa) 
in the presence of polyethylene glycol (PEG). We used both the T7 
endonuclease I (T7E1)(ref. 10) assay and targeted deep sequencing to 
measure mutation frequencies in transfected cells (Fig. 1a,b). Indels 
were detected at the expected positions, that is, 3 nucleotides (nt) 
upstream of an NGG protospacer-adjacent motif (PAM), with 
frequencies that ranged from 8.4% to 44% (Fig. 1a). 

We also co-transfected two gRNAs whose target sites were separated 
by 201 base pairs (bp) in the BRASSINOSTEROID INSENSITIVE 1 
(BRI1) gene in Arabidopsis to investigate whether the repair of two 
concurrent DSBs would result in targeted deletion of the intervening 
sequence, as has been seen in human cells!!. Sanger sequencing showed 
that a 223-bp DNA sequence was deleted in protoplasts (Fig. 1c). 
Figure 1 RGEN RNP-mediated gene disruption 
in plant protoplasts of Nicotiana attenuata, 
Arabidopsis thaliana and Oryza sativa. 
that RGENs cut target sites immediately after 
transfection and induce mutation before a 
full cycle of cell division is completed. 

Next, we investigated whether RGEN RNPs induce off-target 
mutations at sites highly homologous to on-target sites. We searched 
for potential off-target sites of the PHYTOCHROME B (PHYB) and 
BRI1 gene-specific sgRNAs in the Arabidopsis genome using the Cas- 
OFFinder program!” and used targeted deep sequencing to meas- 
ure mutation frequencies (Supplementary Fig. 1). Indels were not 
detected at any sites that differed from on-target sites by 2-5 nt, in 
line with previous findings in human cells!*-1*. 
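The notion of a candidate site that "differs from the on-target site by 2-5 nt" can be made concrete with a small mismatch counter. This is only an illustration of the criterion, not a reimplementation of Cas-OFFinder, which also scans the genome and checks for a PAM. The function name and both sequences are hypothetical.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


# Hypothetical 20-nt protospacer and candidate genomic site.
guide = "GACCTTGACGATCCGTAAGC"
site = "GACCTTGTCGATCCGAAAGC"

n = count_mismatches(guide, site)
print(n)

# A site is kept as a potential off-target if it has 2-5 mismatches.
is_candidate = 2 <= n <= 5
print(is_candidate)
```

In practice each retained site would then be amplified and analysed by targeted deep sequencing, as described in the text.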

Finally, we transfected an RGEN RNP to disrupt the lettuce 
(Lactuca sativa) homolog of the A. thaliana BRASSINOSTEROID 
INSENSITIVE 2 (BIN2) gene (Supplementary Fig. 2), which encodes 
a negative regulator in a brassinosteroid (BR) signaling pathway, 
into lettuce protoplasts, and obtained microcalli regenerated from the 


RNP-transfected cells (Fig. 2 and Supplementary Fig. 3). We used 
the same RGEN RNP in an RFLP analysis to genotype the lettuce microcalli. 
Unlike the T7E1 assay, this analysis distinguishes monoallelic 
mutant clones (50% cleavage) from heterozygous biallelic mutant clones 
(no cleavage) and homozygous biallelic mutant clones (no cleavage) 
from wild-type clones (100% cleavage)?. Furthermore, the RGEN-RFLP 
assay is not limited by sequence polymorphisms near the nuclease 
target site that may exist in the lettuce genome. This assay showed 
that 2 of 35 (5.7%) calli contained monoallelic mutations and 14 of 35 
(40%) calli contained biallelic mutations at the target site (Fig. 2b), 
demonstrating that RGEN-induced mutations were maintained after 
regeneration. The overall mutation frequency in lettuce calli was 46%. 
We used targeted deep sequencing to confirm genotypes in the 16 
mutant calli. The number of base pairs deleted or inserted at the target 
Figure 2 Targeted gene knockout in lettuce using an RGEN RNP. (a) The target sequence in the BIN2 gene. The PAM sequence is shown in red. (b) Genotyping 
of microcalli. Top, RGEN-RFLP analysis. Bottom, mutant DNA sequences in microcalli. (c) Whole plants regenerated from RGEN RNP-transfected protoplasts. 
Scale bars, 10 cm. (d) T1 plantlets obtained from a homozygous biallelic mutant termed T0-12. Scale bars, 1 cm. (e) RGEN-RFLP analysis for genotyping 
T1 plantlets. (f) DNA sequences of the wild type, the T0-12 mutant, and T1 mutants derived from the T0-12 line. Red triangles indicate an inserted nucleotide. 
site ranged from −9 to +1, consistent with mutagenic patterns 
observed in human cells!4. No apparent mosaicism was detected in 
these clones (Supplementary Fig. 4), suggesting that the RGEN RNP 
cleaved the target site immediately after transfection and induced 
indels before cell division was completed. 
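The mutation frequencies reported for the lettuce calli follow directly from the genotype counts given in the text (2 monoallelic and 14 biallelic mutants among 35 calli). A minimal sketch of the arithmetic, with illustrative variable names:

```python
TOTAL_CALLI = 35
MONOALLELIC = 2    # calli with one mutated allele (from the text)
BIALLELIC = 14     # calli with both alleles mutated (from the text)


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


mono_pct = percent(MONOALLELIC, TOTAL_CALLI)
bi_pct = percent(BIALLELIC, TOTAL_CALLI)
overall_pct = percent(MONOALLELIC + BIALLELIC, TOTAL_CALLI)

print(f"monoallelic: {mono_pct:.1f}%")   # 5.7%
print(f"biallelic:   {bi_pct:.1f}%")     # 40.0%
print(f"overall:     {overall_pct:.1f}%")  # 45.7%, reported as 46%
```

The overall figure counts mutant calli, not mutant alleles. Counting alleles instead would give (2 + 2 × 14) / 70, or about 43%.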

We next evaluated whether the BIN2-specific RGEN induced 
off-target mutations in the lettuce genome using high-throughput 
sequencing. No off-target mutations were detected at 91 homologous 
sites that differed by 1-5 nucleotides from the on-target site in three 
BIN2-mutated plantlets (Supplementary Tables 1 and 2), consistent 
with our findings in human cells that off-target mutations induced by 
CRISPR RGENs are rarely found in a single cell-derived clone!®. 

Subsequently, whole plants were successfully regenerated from these 
genome-edited calli and grown in soil (Fig. 2c and Supplementary 
Fig. 5). Seeds were obtained and germinated from a fully grown 
homozygous biallelic mutant. As expected, the mutant allele was 
transmitted to the next generation (Fig. 2d-f). Further studies are 
needed to test whether the BIN2-disrupted lettuce has the predicted 
phenotype of increased BR signaling. 

In conclusion, we have shown that RGEN RNPs can be used to 
induce targeted genome modifications in six genes in four different 
plant species. RGEN-induced mutations were stably maintained in 
whole plants that were regenerated from the protoplasts and were 
transmitted to the germline. Because no recombinant DNA is used in 
this process, the resulting genome-edited plants might be exempt from 
current GMO regulations!”, paving the way for the widespread use of 
RNA-guided genome editing in plant biotechnology and agriculture. 
ONLINE METHODS 

Cas9 protein and guide RNAs. Cas9 protein tagged with a nuclear localization 
signal was purchased from ToolGen, Inc. (South Korea). Templates for 
guide RNA transcription were generated by oligo-extension using Phusion 
polymerase (Supplementary Table 3). Guide RNAs were in vitro transcribed 
through runoff reactions using the T7 RNA polymerase (New England 
BioLabs) according to the manufacturer’s protocol. The reaction mixture was 
treated with DNase I (New England BioLabs) in 1x DNase I reaction buffer. 
Transcribed sgRNAs were resolved on an 8% denaturing urea-polyacrylamide 
gel with SYBR gold staining (Invitrogen) for quality control. Transcribed 
sgRNAs were purified with MG PCR Product Purification SV (Macrogen) 
and quantified by spectrometry. 


Protoplast culture. Protoplasts were isolated as previously described from 
Arabidopsis!8, rice!? and lettuce”°. Initially, Arabidopsis (Arabidopsis thaliana) 
ecotype Columbia-0, rice (Oryza sativa L.) cv. Dongjin, and lettuce (Lactuca 
sativa L.) cv Cheongchima seeds were sterilized in a 70% ethanol, 0.4% 
hypochlorite solution for 15 min, washed three times in distilled water, and 
sown on 0.5x Murashige and Skoog solid medium supplemented with 2% 
sucrose. The seedlings were grown under a 16 h light (150 µmol m−2 s−1) and 
8 h dark cycle at 25 °C in a growth room. For protoplast isolation, the leaves 
of 14 d Arabidopsis seedlings, the stem and sheath of 14 d rice seedlings, and 
the cotyledons of 7 d lettuce seedlings were digested with enzyme solution 
(1.0% cellulase R10, 0.5% macerozyme R10, 0.45 M mannitol, 20 mM MES 
[pH 5.7], CPW solution?!) during incubation with shaking (40 r.p.m.) for 12 h 
at 25 °C in darkness and then diluted with an equal volume of W5 solution. 
The mixture was filtered before protoplasts were collected by centrifugation 
at 100g in a round-bottomed tube for 5 min. Re-suspended protoplasts were 
purified by floating on a CPW 21S (21% [w/v] sucrose in CPW solution, 
pH 5.8) followed by centrifugation at 80g for 7 min. The purified 
protoplasts were washed with W5 solution and pelleted by centrifugation at 
70g for 5 min. Finally, protoplasts were re-suspended in W5 solution and 
counted under the microscope using a hemocytometer. Protoplasts were 
diluted to a density of 1 x 10° protoplasts/ml of MMG solution (0.4 M mannitol, 
15 mM MgCl2, 4 mM MES [pH 5.7]). In the case of tobacco protoplasts, 
3-week-old Nicotiana attenuata leaves grown in B5 media were digested 
with enzymes (1% cellulase R10, 0.25% macerozyme R10, 0.5 M mannitol, 
8 mM CaCl2, 5 mM MES [pH 5.7], 0.1% BSA) for 5 h at 25 °C in darkness. 
Subsequently, protoplasts were washed with an equal volume of W5 solution 
twice. To obtain intact protoplasts, N. attenuata protoplasts in W5 solution 
were applied to an equal volume of 21% sucrose gradient followed by swing- 
out centrifugation at 50g for 5 min. The intact protoplasts were re-suspended 
in W5 solution and stabilized at least for 1 h at 4 °C before PEG-mediated 
transfection. 


Protoplast transfection. PEG-mediated RNP transfections were performed 
as previously described!®. Briefly, to introduce DSBs using an RNP complex, 
1 x 10° protoplast cells were transfected with Cas9 protein (10-60 µg) 
premixed with in vitro-transcribed sgRNA (20-120 µg). Prior to transfection, 
Cas9 protein in storage buffer (20 mM HEPES pH 7.5, 150 mM KCl, 
1 mM DTT, and 10% glycerol) was mixed with sgRNA in 1x NEB buffer 3 and 
incubated for 10 min at room temperature. A mixture of 1 x 10° protoplasts 
(or 5 x 10° protoplasts in the case of lettuce) re-suspended in 200 µl MMG 
solution was gently mixed with 5-20 µl of RNP complex and 210 µl of freshly 
prepared PEG solution (40% [w/v] PEG 4000; Sigma No. 95904, 0.2 M mannitol 
and 0.1 M CaCl2), and then incubated at 25 °C for 10 min in darkness. 
After incubation, 950 µl W5 solution (2 mM MES [pH 5.7], 154 mM NaCl, 
125 mM CaCl2 and 5 mM KCl) was added slowly. The resulting solution was 
mixed well by inverting the tube. Protoplasts were pelleted by centrifugation 
at 100g for 3 min and re-suspended gently in 1 ml WI solution (0.5 M 
mannitol, 20 mM KCl and 4 mM MES (pH 5.7)). Finally, the protoplasts 
were transferred into multi-well plates and cultured under dark conditions 
at 25 °C for 24-48 h. Cells were analyzed one day after transfection. 
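The Cas9 and sgRNA amounts above are given by mass, whereas RNP assembly depends on the molar ratio. The conversion can be sketched as follows. The molecular weights are assumptions not stated in the text: roughly 160 kDa for tagged SpCas9, and roughly 33 kDa for an sgRNA of about 100 nt at an average of about 330 Da per nucleotide. The resulting ratio shifts with the true values.

```python
CAS9_MW_KDA = 160.0    # assumed molecular weight of tagged SpCas9
SGRNA_MW_KDA = 33.0    # assumed: ~100 nt x ~0.33 kDa per nucleotide


def picomoles(mass_ug: float, mw_kda: float) -> float:
    """Convert a mass in micrograms to picomoles.

    ug / kDa gives nanomoles (1e-6 g divided by 1e3 g/mol = 1e-9 mol),
    so the result is multiplied by 1000 to express it in picomoles.
    """
    return mass_ug / mw_kda * 1000.0


cas9_pmol = picomoles(10.0, CAS9_MW_KDA)     # lowest Cas9 dose in the text
sgrna_pmol = picomoles(20.0, SGRNA_MW_KDA)   # lowest sgRNA dose in the text
molar_excess = sgrna_pmol / cas9_pmol

print(f"Cas9:  {cas9_pmol:.1f} pmol")
print(f"sgRNA: {sgrna_pmol:.1f} pmol")
print(f"sgRNA:Cas9 molar ratio of about {molar_excess:.1f}")
```

Because the quoted mass ranges keep a 1:2 Cas9:sgRNA mass ratio at both ends, the molar ratio under these assumed weights is the same for the 60 µg and 120 µg doses.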


Protoplast regeneration. RNP-transfected protoplasts were re-suspended in 
0.5x B5 culture medium? supplemented with 375 mg/l CaCl2·2H2O, 18.35 mg/l 
NaFe-EDTA, 270 mg/l sodium succinate, 103 g/l sucrose, 0.2 mg/l 2,4-dichlorophenoxyacetic 
acid (2,4-D), 0.3 mg/l 6-benzylaminopurine (BAP) and 
0.1 g/l MES. The protoplasts were mixed with a 1:1 solution of 0.5x B5 
medium and 2.4% agarose to a culture density of 2.5 x 10° protoplasts/ml. The 
protoplasts embedded in agarose were plated onto 6-well plates, overlaid with 
2 ml of liquid 0.5x B5 culture medium, and cultured at 25 °C in darkness. 
After 7 days, the liquid medium was replaced with fresh culture medium. 
The cultures were transferred to the light (16 h light [30 µmol m⁻² s⁻¹] and
8 h darkness) and cultured at 25 °C. After 3 weeks of culture, micro-calli 
grown to a few millimeters in diameter were transferred to MS regeneration 
medium supplemented with 30 g/l sucrose, 0.6% plant agar, 0.1 mg/l
α-naphthaleneacetic acid (NAA) and 0.5 mg/l BAP. Induction of multiple lettuce
shoots was observed after about 4 weeks on regeneration medium. 


Rooting, transfer to soil and hardening of lettuce. To regenerate whole 
plants, proliferated and elongated adventitious shoots were transferred to 
a fresh regeneration medium and incubated for 4-6 weeks at 25 °C in the 
light (16 h light [150 µmol m⁻² s⁻¹] and 8 h darkness). For root induction,
approximately 3-5-cm-long plantlets were excised and transferred onto a solid 
hormone-free 0.5x MS medium in Magenta vessels. Plantlets developed from 
adventitious shoots were subjected to acclimation, transplanted to potting soil, 
and maintained in a growth chamber at 25 °C (under cool-white fluorescent 
lamps with a 16-h photoperiod). 


T7E1 assay. Genomic DNA was isolated from protoplasts or calli using DNeasy 
Plant Mini Kit (Qiagen). The target DNA region was amplified and subjected 
to the T7E1 assay as described previously!®. In brief, PCR products were 
denatured at 95 °C and cooled slowly to room temperature using
a thermal cycler. Annealed PCR products were incubated with T7 endonu- 
clease I (ToolGen, Inc.) at 37 °C for 20 min and analyzed via agarose gel 
electrophoresis. 


RGEN-RFLP. The RGEN-RFLP assay was performed as previously described’.
Briefly, PCR products (300-400 ng) were incubated in 1x NEB buffer 3 for
60 min at 37 °C with Cas9 protein (1 µg) and sgRNA (750 ng) in a reaction
volume of 10 µl. RNase A (4 µg) was then added to the reaction mixture and
incubated at 37 °C for 30 min to remove the sgRNA. The reaction was stopped 
by adding 6x stop solution (30% glycerol, 1.2% SDS, 250 mM EDTA). DNA 
products were electrophoresed using a 2.5% agarose gel. 


Targeted deep sequencing. The on-target and potential off-target sites were 
amplified from genomic DNA. Indices and sequencing adaptors were added by 
additional PCR. High-throughput sequencing was performed using Illumina
MiSeq (v2, 300-cycle). 

Genetic modification of the diarrhoeal pathogen Cryptosporidium parvum


Sumiti Vinayak1*, Mattie C. Pawlowic1*, Adam Sateriale1, Carrie F. Brooks1, Caleb J. Studstill1, Yael Bar-Peled1, Michael J. Cipriano1 & Boris Striepen1,2


Recent studies into the global causes of severe diarrhoea in young 
children have identified the protozoan parasite Cryptosporidium as 
the second most important diarrhoeal pathogen after rotavirus’ ’. 
Diarrhoeal disease is estimated to be responsible for 10.5% of overall 
child mortality*. Cryptosporidium is also an opportunistic pathogen 
in the contexts of human immunodeficiency virus (HIV)-caused 
AIDS and organ transplantation®*®. There is no vaccine and only a 
single approved drug that provides no benefit for those in gravest 
danger: malnourished children and immunocompromised patients”*. 
Cryptosporidiosis drug and vaccine development is limited by the poor 
tractability of the parasite, which includes a lack of systems for con- 
tinuous culture, facile animal models, and molecular genetic tools*’. 
Here we describe an experimental framework to genetically modify 
this important human pathogen. We established and optimized trans- 
fection of C. parvum sporozoites in tissue culture. To isolate stable 
transgenics we developed a mouse model that delivers sporozoites 
directly into the intestine, a Cryptosporidium clustered regularly inter- 
spaced short palindromic repeat (CRISPR)/Cas9 system, and in vivo 
selection for aminoglycoside resistance. We derived reporter parasites 
suitable for in vitro and in vivo drug screening, and we evaluated the 
basis of drug susceptibility by gene knockout. We anticipate that the 
ability to genetically engineer this parasite will be transformative for 
Cryptosporidium research. Genetic reporters will provide quantitative
correlates for disease, cure and protection, and the role of parasite
genes in these processes is now open to rigorous investigation.

Cryptosporidium infection occurs through faecal-oral transmission
of the environmentally resilient oocyst. The oocyst shelters four sporozoites
that emerge in the small intestine and invade the epithelium.
Although there is no tissue culture system for continuous passage,
C. parvum development can be observed for 2-3 days by infecting
human ileocaecal adenocarcinoma cells (HCT-8)"°. To achieve transfection,
sporozoites were excysted from oocysts purified from the faeces
of experimentally infected calves using a protocol that mimics stomach
and intestinal passage’’, and then electroporated before infection of
HCT-8 cells (Fig. 1a). The transfection plasmids used here flanked a
variety of reporter genes with candidate C. parvum 5’ and 3’ regulatory
sequences derived from highly expressed housekeeping genes. We
observed significant reporter activity 48 h after transfection using plasmids
carrying nanoluciferase (Nluc; Fig. 1b), a small ATP-independent
enzyme from deep sea shrimp”, but not firefly luciferase or fluorescent
proteins. Nluc luminescence correlated with the number of parasites
and the amount of DNA used for transfection. Luminescence was also
shown to require the presence of parasite-specific promoter elements
and the introduction of DNA into parasites and not host cells (Fig. 1).

Figure 1 | Transfection of C. parvum. a, Schematic overview. C. parvum
sporozoites were prepared from oocysts purified from infected calves and
electroporated in the presence of plasmid DNA before infection of HCT-8 cells
(Eno, flanking sequence from the C. parvum enolase gene). b-j, Luminescence
measurements (the means of three technical replicates, standard deviation
(s.d.) shown as error bars) of C. parvum (b-e, h-j, blue), T. gondii (f), or human
HCT-8 cells (g) transfected with Nluc expression plasmids. b-d, C. parvum
transfection requires electroporation (b) of DNA (c) into parasites (d).
e, f, h, Transfection also requires plasmids to carry parasite-specific promoter
sequences (e, f; testing C. parvum (Cp) and T. gondii (Tg) promoters in both
parasites), and is susceptible to the Cryptosporidium drug nitazoxanide
(h). g, Lipofection of HCT-8 cells with the original Nluc plasmid pNL1.1
(Promega), but not derived parasite vectors, results in luciferase activity in the
host alone. Choice of promoter (i; enolase (Eno), aldolase (Aldo), α-tubulin 5’
regions (Tub) (the 3’ untranslated region (UTR) was uniformly from the
enolase gene)) or codon composition (j; Nluc optimized to 35% GC (oNluc))
influences expression level in C. parvum. Note automatic gain adjustment of
luminescence measurements; units are not comparable between panels.
Independent biological experiments were repeated three times, and
representative data are shown.

Figure 2 | Luciferase assays for C. parvum drug resistance and
CRISPR/Cas9 activity. a, HCT-8 cells were infected with Nluc-transfected 
sporozoites and grown for 2 days in the presence of paromomycin. b, Trans- 
lational fusions were constructed placing Neo at the amino or carboxy terminus 
of Nluc. Nluc-Neo shows luciferase activity, albeit at a reduced level when 
compared to Nluc alone. c, C. parvum transfected with Nluc (blue) or Nluc-Neo
(red) were grown in different concentrations of paromomycin. Luciferase
activity for each plasmid was normalized to its drug-free level. d, CRISPR/Cas9 
plasmid for C. parvum. Flag, epitope tag; nls, nuclear localization signal; 

ribo, ribosomal protein L13A 3’ UTR; u6, newly annotated promoter 
CM000433:553110-553472. e, g, Outline (e) and sequences (g) for Nluc repair 
assay. Guide RNA target, blue; protospacer adjacent motif, green; mutagenized 
codon 18, red. f, Sporozoites were transfected with Nluc or a codon 18 
termination mutant (Dead Nluc); note ablation of signal. In addition to the 
Dead Nluc plasmid, some parasites also received a 125 bp double-stranded 
repair DNA fragment, and the Cas9 plasmid with the indicated guide RNAs 
(gRNAs; no target, empty gRNA cassette; off target, GFP gRNA; on target, Nluc 
gRNA). Statistical analysis compares Dead Nluc alone with Dead Nluc and 
Cas9 and specific gRNA. Note significant Cas9-mediated restoration of 
luciferase activity (***P = 0.0006, unpaired t-test). n = 3 technical replicates 
for a-c, and controls from f; n = 6 technical replicates for on-target samples
in f. Error bars are s.d. and all experiments depicted here were repeated three 
times and representative data are shown. 


Furthermore, reporter signal was ablated by the anti-parasitic 
drug nitazoxanide. Transient transfection of C. parvum is inefficient 
(<10,000 fold when compared to the related apicomplexan Toxo- 
plasma gondii in parallel experiments) and requires a highly sensitive 
reporter such as Nluc to be noticeable. 

In an effort to enhance efficiency we evaluated different electropora- 
tion devices, electrical wave programs and buffer compositions 
(Extended Data Fig. 1); this produced tenfold enhancement. We tested 
flanking sequences from different C. parvum genes and identified the 
enolase promoter to be strongest. The C. parvum genome is AT rich 
and shows strong codon bias'’. We also noted a preference for A over T 
within the first 20 codons and thus explored codon optimization and 
found sixfold enhancement (Fig. 1j).
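
The codon-composition argument can be illustrated with a short calculation. The sketch below is not the authors' optimization procedure; it only reports the GC fraction of a coding sequence and counts A versus T within the first 20 codons, the two properties mentioned above. The function names and the toy sequence are assumptions introduced here for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


def a_and_t_in_first_codons(seq: str, n_codons: int = 20) -> tuple:
    """Count A and T bases within the first n_codons codons of a coding sequence."""
    window = seq.upper()[: 3 * n_codons]
    return window.count("A"), window.count("T")


# Hypothetical toy coding sequence, not taken from the paper.
toy_cds = "ATGAAATTTGCA"
print(f"GC fraction: {gc_fraction(toy_cds):.2f}")
print("A, T in first 20 codons:", a_and_t_in_first_codons(toy_cds))
```

A design target such as 35% GC would correspond to `gc_fraction` returning roughly 0.35 for the optimized gene.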

To enable enrichment of transgenic parasites, we next explored the 
selection of drug resistance. The aminoglycoside antibiotic paromomy- 
cin does not cure cryptosporidiosis in people, but is effective in tissue 
culture (Fig. 2a) and in immunocompromised mice’*. Work in other 
protist models has shown aminoglycoside phosphotransferases to 
Figure 3 | Mouse model for selection of stable C. parvum transgenics.

a, Outline of the selection strategy. Transfected sporozoites were injected into 
the small intestine by surgery (Extended Data Fig. 2) and mice were treated 
with paromomycin. Oocysts were purified from the faeces and used to infect 
cultures or mice by oral gavage. b, Quantitative PCR of C. parvum DNA isolated 
from faeces of mice infected with transfected sporozoites (four mice per group) 
and treated as indicated. Emergence of paromomycin resistance required the 
Nluc-Neo and Cas9 plasmids. c, d, Upon reinfection, parasites show strong drug 
resistance (c) and luciferase activity (d). In repeat experiments we noted that 
luciferase is detectable as early as 6 days after transfection in the faeces of the first 
infected mouse (Extended Data Fig. 4). e, Protein extracts from oocysts were 
analysed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) and western 
blot using an antibody against Neo (rabbit anti-neomycin phosphotransferase 
II; EMD Millipore). Predicted molecular mass of the Nluc-Neo fusion protein is 
48.3 kDa. f, Immunofluorescence staining using anti-Neo (mouse anti-Neo; 
Alpha Diagnostic International) and C. parvum (tryptophan synthase B) 
antibodies. Note multiple nuclei in 4’,6-diamidino-2-phenylindole (DAPI) stain 
typical for C. parvum meronts. No anti-Neo staining was observed in wild-type 
parasites. g, Luciferase assays for HCT-8 cultures infected with wild-type (WT; 
blue) and transgenic (Nluc-Neo; red) parasites. The y-axis is split to show level 
of luminescence background. n = 3 technical replicates, error bars are s.d., the 
experiment was done twice. h, Ninety-six-well luciferase drug assay using 
1,000 oocysts per well. Note significant growth inhibition on treatment with 
10 µM nitazoxanide (**P = 0.0036, unpaired t-test). n = 3 technical replicates,
error bars are s.d., the experiment was repeated two times and representative 
data are shown. 
confer resistance to paromomycin’*’*. Appreciation of C. parvum drug
resistance in culture is complicated by the lack of continuous growth. 
We thus constructed translational fusions between the Nluc reporter 
and the neomycin resistance marker (Neo)'* to focus our observation 
on the small subset of transfected parasites. Luciferase activity in para- 
sites expressing Nluc-Neo showed reduced susceptibility to paromo- 
mycin treatment compared to Nluc alone (Fig. 2c), and thus we 
concluded that Nluc-Neo confers drug resistance in this transient assay.
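
Comparing the two plasmids depends on expressing each luminescence reading relative to that plasmid's own drug-free level (as in Fig. 2c). A minimal sketch of that normalization follows; the data layout, function name and numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def normalize_to_drug_free(readings: dict) -> dict:
    """Scale mean luminescence at each drug concentration to the drug-free mean.

    readings maps drug concentration -> list of replicate luminescence values
    and must contain the key 0 (no drug).
    """
    baseline = mean(readings[0])
    if baseline <= 0:
        raise ValueError("drug-free signal must be positive")
    return {conc: mean(vals) / baseline for conc, vals in sorted(readings.items())}


# Hypothetical replicate readings (arbitrary luminescence units).
nluc = {0: [200.0, 210.0, 190.0], 100: [40.0, 42.0, 38.0]}
nluc_neo = {0: [50.0, 55.0, 45.0], 100: [40.0, 44.0, 36.0]}

print("Nluc:    ", normalize_to_drug_free(nluc))
print("Nluc-Neo:", normalize_to_drug_free(nluc_neo))
```

Normalizing each plasmid to its own baseline matters because the fusion protein gives a lower absolute signal than Nluc alone, so raw values would not be comparable.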

Our genome searches indicated that Cryptosporidium species lack 
non-homologous end joining DNA repair. This suggested transgene 
integration to be rare and to require homologous recombination'””’. 
Such recombination can be enhanced by long flanking regions and/or 
double-strand breaks introduced by restriction enzymes, transcription 
activator-like effector nucleases (TALENs) or CRISPR/Cas9 (refs 18, 19). 
To build a C. parvum CRISPR/Cas9 system, we constructed a plasmid in 
which the C. parvum U6 RNA promoter drives a guide RNA cassette” 
and the Streptococcus pyogenes Cas9 gene”’ is flanked by parasite regu- 
latory sequences (Fig. 2d). To test this system, we conducted a Cas9- 
dependent DNA repair experiment (Fig. 2e-g). We introduced a stop 
codon into the Nluc reporter that ablated luciferase activity (Dead Nluc). 
We then targeted the dead gene with a guide RNA, and provided a short 
double-stranded template for repair that restores read-through trans- 
lation and renders the repaired gene resistant to further Cas9 cutting. 
When C. parvum sporozoites are co-transfected with a specific guide, 
luciferase activity is restored (P = 0.0006, unpaired t-test). No change is 
observed with no or off-target guides. 
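
The logic of the repair assay, in which a repaired gene escapes further cutting once its PAM is altered, can be sketched in code. The example below scans one strand of a sequence for 20-nucleotide protospacers followed by the NGG PAM recognized by S. pyogenes Cas9. The sequences are hypothetical toy strings, not the Nluc target from the paper, and the scan ignores the reverse strand and any mismatch tolerance.

```python
import re


def find_spcas9_sites(seq: str, protospacer_len: int = 20) -> list:
    """Return (start, protospacer, pam) for every NGG PAM site on the given strand."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical toy sequences: the same protospacer with an intact or a mutated PAM.
guide = "ACGTACGTACGTACGTACGT"
target = guide + "TGG" + "AAA"      # PAM present: a candidate Cas9 site
repaired = guide + "TGA" + "AAA"    # PAM disrupted: no candidate site remains

print(find_spcas9_sites(target))
print(find_spcas9_sites(repaired))
```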

Interferon-γ knockout mice are susceptible to C. parvum infection
through oral inoculation of oocysts”. However, infection with free 
sporozoites is less effective’, probably due to stomach passage. We 
developed a surgical protocol to inject transfected sporozoites directly 
into the small intestine to maximize infection (Extended Data Fig. 2). 
When mice were killed 24 h after infection, luciferase activity was 
observed in scrapings of the intestinal epithelium. We also established 
an effective treatment protocol using paromomycin supplementation 
of the drinking water (Extended Data Fig. 3). 

Next, we infected mice by surgery with transfected sporozoites and 
treated them with paromomycin as indicated (Fig. 3 and Extended Data 
Fig. 4; four mice per group). Faeces were collected every 3 days and 
oocyst shedding was measured by quantitative polymerase chain reaction
(PCR) targeting the C. parvum 18S ribosomal RNA locus. Mice
infected with parasites transfected with the Nluc-Neo plasmid that did
not receive drug shed high numbers of oocysts and remained infected 
for the 30 days observed (Fig. 3b, blue). Those infected with parasites 
that received the Nluc plasmid (lacking the Neo gene; Fig. 3b, green) 
were rapidly cured by drug treatment. Those transfected with Nluc-Neo 
alone and drug treated were also cured (infection may persist slightly 
longer). In contrast, infection with parasites carrying the Nluc-Neo 
plasmid and the Cas9 plasmid (Fig. 3b, red; Cas9 target detailed later) 
rapidly rebounded to levels similar to untreated mice. Oocysts emerging 
from selection were purified from faeces and used to infect mice that 
were again treated with paromomycin; wild-type oocysts were used in 
parallel (100,000 oocysts per mouse by gavage). While paromomycin 
treatment cured infection with wild-type parasites, transgenic parasites 
showed immediate robust drug resistance (Fig. 3c). When these oocysts 
were probed by western blot with anti-Neo antibody, we detected a band 
consistent with an Nluc-Neo fusion protein. 

Purified oocysts were also used to infect cell cultures, and processed 
for immunofluorescence after 2 days. Transgenic but not wild-type 
intracellular parasite stages showed fluorescence when probed with 
antibodies specific for either Neo or Nluc (Fig. 3f and data not shown).
These cultures also displayed strong luciferase activity not observed in 
wild type. This activity exceeded that previously observed in transient 
transfection experiments by five orders of magnitude on a per-cell 
basis. We assessed whether these organisms could be suitable for 
drug-screening assays by infecting 96-well plates with 1,000 oocysts 
per well and measured luciferase after 48 h. Infected wells were clearly 
distinguishable from uninfected wells (Z′ > 0.6; n = 20). Similarly,
wells treated with nitazoxanide showed significant growth inhibition 
(P = 0.0036, unpaired t-test). Luciferase also provided a convenient 
way to assess the infection state of animals. We sampled 10 mg of 
faeces from mice diagnosed in parallel by PCR and found this assay 
to be sensitive, specific and faster than PCR (Fig. 3d). We note that 
Nluc expression remains stable when parasites are propagated in mice 
in the absence of paromomycin (Extended Data Fig. 5). 
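
The plate-quality statistic quoted above for infected versus uninfected wells is conventionally the Z′-factor, defined as 1 - 3(σ_pos + σ_neg)/|μ_pos - μ_neg|. Assuming that is the statistic the authors report, a minimal calculation looks like this; the well readings are invented for illustration.

```python
from statistics import mean, stdev


def z_prime(positive: list, negative: list) -> float:
    """Z'-factor for assay quality from positive- and negative-control wells."""
    mu_p, mu_n = mean(positive), mean(negative)
    if mu_p == mu_n:
        raise ValueError("control means are identical; Z' is undefined")
    # Sample standard deviations of each control group.
    sd_p, sd_n = stdev(positive), stdev(negative)
    return 1.0 - 3.0 * (sd_p + sd_n) / abs(mu_p - mu_n)


# Hypothetical luminescence readings (arbitrary units).
infected_wells = [100.0, 110.0, 90.0]
uninfected_wells = [10.0, 11.0, 9.0]
print(f"Z' = {z_prime(infected_wells, uninfected_wells):.3f}")
```

Values above about 0.5 are generally taken to indicate a well-separated assay suitable for screening.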

Cryptosporidium is remarkably resistant to antifolates, a mainstay of 
treatment against other apicomplexans, and this resistance has been 
attributed to differences in the target enzyme dihydrofolate reductase- 
thymidylate synthase (DHFR-TS)**. However, Cryptosporidium is 
Figure 4 | Targeted deletion of C. parvum TK. a, Owing to a horizontal gene
transfer, C. parvum has two pathways to synthesize dTMP: TK and DHFR-TS. 
DHF, dihydrofolic acid; THF, tetrahydrofolic acid; dUMP, deoxyuridine
monophosphate. b, Map of the C. parvum TK locus, the targeting plasmid and 
the predicted modified locus. Primers and amplicon sizes of diagnostic PCR 
products are indicated (Ins, insertion). c, PCR analysis using genomic DNA 
from wild-type (WT) and transgenic parasites (Nluc-Neo, oocysts purified 
from faeces of infected mice shown in Fig. 3c; CDS, coding sequence). Primer 
sequences are provided in Supplementary Table 1. e, Quantification of 
EdU-labelling experiments (meronts with four or more nuclei were scored, two 
biological repeats, n = 105 each sample, error bars are s.d.). d, Representative
fluorescence micrographs are shown. Antibody to C. parvum tryptophan 
synthase B was used to identify parasites (green). f, Trimethoprim treatment of 
wild-type (blue) and Nluc-Neo transgenic (red) parasites. Wild-type parasites 
were measured in transient transfection assays with Nluc plasmid (n = 3, 
technical replicates, error bars are s.d.). The assay shown was conducted in the 
presence of 10 µM thymidine to avoid indirect host cell toxicity*® (experiments
without thymidine produced indistinguishable results). Experiments were 
repeated three times and representative data are shown. 


23 JULY 2015 | VOL 523 | NATURE | 479 


unique among apicomplexans in that it acquired a thymidine kinase 
(TK) by horizontal gene transfer from bacteria”. We hypothesized that 
TK may also contribute to Cryptosporidium antifolate resistance by 
providing an alternative route to thymidine monophosphate (dTMP; 
Fig. 4a). For this reason, the TK locus was targeted for insertion, 
allowing us to test this hypothesis by gene disruption. We mapped 
the locus in stable transgenic parasites by PCR using primers that link 
the marker genes with genomic sequences beyond the flanking regions 
on the targeting construct. This mapping is consistent with insertion 
by homologous double crossover (Fig. 4b, c). Furthermore, the TK 
coding sequence is no longer detectable, indicating uniform loss of 
the gene in the selected population. We tested for DNA incorporation 
of the thymidine analogue 5-ethynyl-2’-deoxyuridine (EdU) using 
click chemistry and fluorescence microscopy”. Wild-type parasites 
grown in the presence of EdU show fluorescent nuclei. This labelling 
is lost in the transgenic parasites (Fig. 4d, e), confirming loss of TK at 
the biochemical level. We next treated parasite-infected cultures with
the antifolate trimethoprim. We confirmed the previously observed 
resistance in wild-type parasites, but noted enhanced susceptibility in 
the mutants (Fig. 4f). We conclude that the C. parvum TK is a non- 
essential enzyme required for the activation of thymidine, and that its 
presence limits the efficacy of antifolate therapy in Cryptosporidium. 

We show that major hurdles towards genetic analysis and manipula- 
tion for cryptosporidiosis can be overcome by maximizing the efficiency 
of each step of the process and by focusing on in vivo propagation 
and selection. There is an urgent need for new anti-parasitic drugs’. 
Cryptosporidium is not susceptible to drugs widely used against related 
pathogens, which reflects substantial differences in its metabolism and 
metabolite uptake’’. Luciferase reporter parasites enable phenotypic 
screening in culture and animals with sufficient sensitivity and specifi- 
city to warrant a comprehensive effort to discover novel compounds. 
Gene deletion now permits biological target validation. Genetic modi- 
fication may also allow the construction of attenuated parasites as a 
potential oral vaccine. While infants and toddlers are highly susceptible 
to the disease, infection is rarely detected in older children’*. This is 
consistent with infection studies in people and animals suggesting the 
development of anti-parasitic and anti-disease immunity***”’. A better 
understanding of the mechanisms underlying disease and protection 
will be required to design and produce such a vaccine. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


C. parvum reporter and drug resistance vectors. C. parvum transfection vectors 
were derived from plasmid pH;BG” and modified to contain C. parvum promoter 
and 5’ and 3’ untranslated messenger RNA regions. We mined the genome and a 
variety of expression data sets collectively available through CryptoDB
(http://www.cryptoDB.org)”’ to identify genes that are highly expressed across
the life cycle. Promoters and 5’ UTRs of the enolase (cgd5_1960), α-tubulin (cgd4_2860),
and aldolase (cgd1_3020) genes and 3’ UTRs of enolase (51 bp), α-tubulin (97 bp)
or ribosomal protein L13A (cgd5_970, UTR 211 bp) were amplified from genomic 
DNA by PCR (see Supplementary Table 1 for a list of primer sequences and 
restriction sites used). Nluc was amplified from pNL1.1 (Promega Corporation), 
firefly luciferase and different fluorescent protein genes were amplified from vec- 
tors used for T. gondii****. The neomycin resistance gene was amplified from 
plasmid pNeo4 (ref. 15) (a gift from J. Gaertig, University of Georgia) and introduced
5’ or 3’ of Nluc in a plasmid with enolase regulatory sequences. To target the
TK gene, regions flanking the gene were amplified and introduced into the Nluc- 
Neo vector (the promoter but not the 3’ UTR was retained). 

C. parvum CRISPR/Cas9 genome editing. Human codon-optimized 
Streptococcus pyogenes Cas9 (hSpCas9) carrying a Flag tag and N- and 
C-terminal nuclear localization signals was amplified from pX330 (ref. 35) and 
introduced into the Aldolase-Nluc-ribo vector replacing the Nluc. A guide RNA 
cassette was synthesized containing the C. parvum U6 promoter identified by 
genome searches using known structural RNA sequences from Plasmodium 
falciparum”, two inverted BbsI restriction sites to facilitate guide cloning, a 
trans-activating CRISPR RNA (tracrRNA) consensus sequence and a terminator 
(poly T) sequence, and was introduced into the Cas9 plasmid. 

To test for CRISPR/Cas9-mediated repair in vitro, we modified the codon- 
optimized Nluc vector by introducing a premature stop codon (Y18Stop) adjacent 
to a guide target sequence at the beginning of the gene by site-directed mutagenesis 
(QuikChange II, Agilent Technologies). A 125 bp double-stranded (ds)DNA oli- 
gonucleotide was synthesized that restored Y18 and disrupted the PAM motif 
(G17A) of the guide RNA target, thus rendering it resistant to further Cas9 cuts. 

Parasite excystation and transfection. Oocyst excystation was carried out as
described" with some modification. Up to 10° C. parvum Iowa strain oocysts 
(Sterling Parasitology Laboratory or Bunch Grass Farm) were suspended in 
100 µl of a 1:4 aqueous dilution of 5.25% sodium hypochlorite and incubated on
ice for 5 min. Oocysts were then washed three times with ice-cold PBS, suspended 
at 3.9 X 10° oocysts per ml of 0.2 mM sodium taurocholate (prepared in PBS) and 
incubated at 15 °C (10 min) and then at 37 °C (60-90 min). Emergence of sporozoites
was monitored microscopically (typical efficiency 70-90%). Sporozoites
were filtered through a 3 µm polycarbonate filter to remove unexcysted oocysts,
washed with ice-cold PBS, and counted. 

Initially we used a BTX ECM 630 device for electroporation (Harvard 
Apparatus). Excysted sporozoites (10”) were suspended in complete cytomix buffer
(120 mM KCl, 0.15 mM CaCl2, 10 mM K2HPO4/KH2PO4, pH 7.6, 25 mM
HEPES, pH 7.6, 2 mM EGTA, 5 mM MgCl2, pH 7.6, supplemented with 2 mM
ATP and 5 mM glutathione), mixed with plasmid DNA, and electroporated with a
single 1,500 V pulse, resistance of 25 Ω, and a capacitance of 25 µF. To enhance
transfection efficiency, we switched to using the AMAXA Nucleofector 4D device
(Lonza Cologne GmbH). After excystation, 10’ sporozoites were suspended in
15 µl Lonza SF Buffer and combined with 10-50 µg DNA (prepared in Tris-EDTA,
pH 8.0) at a final volume of 20 µl. The parasite-DNA mix was added to
small, strip cuvettes and electroporated using program EH100. Additional electroporation
conditions were explored to arrive at this protocol and those are listed
in Extended Data Fig. 1. 

For in vitro transfection assays, human ileocaecal adenocarcinoma (HCT-8) 
cells (ATCC) were grown in RPMI-1640 with glutamine supplemented with 10% 
FBS, 1 mM sodium pyruvate, 50 U ml⁻¹ penicillin, 50 µg ml⁻¹ streptomycin and
amphotericin B in 24-, 48- or 96-well plates to 70% confluency. No effort was made
to authenticate this cell line or test for mycoplasma. Prior to infection, media was
replaced with DMEM with 2% FBS, 50 U ml⁻¹ penicillin, 50 µg ml⁻¹ streptomycin
and amphotericin B, and 0.2 mM L-glutamine. For in vivo experiments electroporated
sporozoites were suspended in PBS and kept on ice until administered to
the mice. 

The T. gondii Nluc plasmid was constructed by inserting the Nluc sequence into 
vector pCTHs (ref. 32) and parasites were electroporated and used to infect human 
foreskin fibroblasts as described*”. HCT-8 cells were cultured in 24-well plates 
until confluent, transfected with 500 ng of DNA using Lipofectamine 2000 as
described by the manufacturer (Life Technologies), and assayed for Nluc activity 
after 48 h. 

Animal ethics statement. Animal experiments were approved by the Institutional 
Animal Care and Use Committee of the University of Georgia (animal use pro- 
tocol no. A2012 03-028-Y3-A12). 
Surgical delivery of transfected sporozoites into IFN-γ-deficient mice. In
preliminary experiments we noted that antibiotic removal of bacterial flora
enhances susceptibility of mice. Before infection, mice were treated orally by
gavage daily for a week with an antibiotic cocktail (3 mg ampicillin,
3 mg streptomycin, 0.95 mg metronidazole, 3 mg neomycin and 1.5 mg
vancomycin in distilled H2O, per mouse per day; all antibiotics purchased from
Sigma). To deliver sporozoites directly to the small intestine, we developed a
mouse survival surgery protocol for female C57BL/6 IFN-γ-deficient mice
(B6.129S7-Ifngtm1Ts/J, Jackson Laboratories) aged 6-8 weeks. The abdominal area
of mice was shaved with clippers. Animals were placed in an isoflurane (3-5%)
anaesthesia induction chamber and then moved to a nosecone (1-3% isoflurane
as needed) on a sterile surgical field. A sterile drape was applied over a warming 
pad after sterilization of the area with 70% ethanol. Respiration and response to 
stimulation (toe pinch) were monitored during the procedure and the vaporizer 
adjusted as needed. Mucous membranes and footpads were monitored for colour 
to confirm adequate perfusion. Three betadine (Povidone-iodine) scrubs followed 
by a 70% ethanol wipe were applied to shaved skin before surgery. Ophthalmic 
ointment (Puralube, Dechra Veterinary Products) was applied to prevent drying 
of eyes. Skin was vertically incised midline of the abdominal region below the 
sternum with microsurgical scissors for approximately 1.5 cm followed by vertical 
incision of the peritoneum. Exposed jejunum/ileum was injected with 10’ transfected
sporozoites suspended in 200 µl PBS containing sterile food colouring dye
as tracer. After injection, suturing was performed to close the peritoneum. Mice 
were administered 0.01-0.02 ml per gram body weight of warm lactated Ringer’s 
solution subcutaneously after surgery. Meloxicam analgesic was also administered 
to the mice after surgery. At completion of the procedure, the eye ointment was 
wiped off and the vaporizer was turned off and the mice were allowed to breathe 
the oxygen supply gas until they began to wake. Mice were placed in a recovery 
area until ambulatory and exhibiting normal respiration and were watched for 2 h 
after surgery. Incision sites were monitored daily until fully healed (10-14 days). 
Twenty-four hours after surgical infection, water in mouse cages was replaced 
with distilled H2O containing 16 mg ml⁻¹ paromomycin, a concentration we
determined to deliver a daily dose of 40 mg kg⁻¹ paromomycin to each mouse
(Extended Data Fig. 3). Mice were randomly assigned to groups before surgery.
A sample size of four animals per treatment group was judged to be sufficiently
large to draw appropriate conclusions. All mice survived surgery and were
included in the results reported here. Investigators were not blinded to group 
allocation during the experiments. 

Mouse faeces collection and storage. Faecal samples were collected from mice 
(typically four mice per cage) every third day, starting 3 days after infection, for up
to a month. Mice were transferred to a fresh, sterile cage for 2-3 h, and faeces from
the cage were collected, pooled, and stored at 4 °C. 

Luciferase assay. For transient transfection experiments, electroporated sporo- 
zoites were added to 70% confluent HCT-8 culture and infection was allowed to 
proceed at 37 °C for 48 h. Media was removed from wells and 200 µl of NanoGlo
lysis buffer supplemented with NanoGlo substrate (1:50, Promega Corporation) 
was added to each well. Cells were scraped and the lysate was transferred to white 
96-well plates and luminescence was measured using a Synergy H4 Hybrid 
Microplate Reader (BioTek Instruments). For drug assays with purified stable 
transgenic oocysts, the culture supernatant was collected after 48 h from 96-well 
plates. An equal amount of supernatant and NanoGlo lysis buffer with substrate 
was combined and luminescence was measured. 

For luciferase measurement from mouse faecal samples, 20 mg of faeces was
weighed into a 1.5-ml microcentrifuge tube and homogenized in 1 ml of lysis
buffer (50 mM Tris-HCl, 10% glycerol, 1% Triton-X, 2 mM dithiothreitol
(DTT), 2 mM EDTA) using 10-15 glass beads (3 mm) and a vortex mixer for
1 min, followed by clarification of lysate by brief centrifugation. One-hundred 
microlitres of lysate was mixed with an equal volume of NanoGlo Luciferase Buffer 
(prepared with 1:50 dilution of substrate) and luminescence was measured as 
described. 
High-throughput imaging assay for parasite growth. For drug assays we used 
either luciferase activity or a 96-well infection and imaging protocol”* using a BD 
Pathway instrument. Parasites and host cells were quantified using an ImageJ 
macro adapted from ref. 39. The ratio of parasites to host nuclei was determined 
for each sample image and normalized to untreated controls. 
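
As an illustration of the normalization just described, the following minimal Python sketch computes a parasite-to-host-nuclei ratio for one image and expresses it relative to untreated controls. The function names and all counts are hypothetical; the ImageJ macro itself is not reproduced here.

```python
def parasite_ratio(parasites, host_nuclei):
    """Ratio of parasites to host nuclei for a single image."""
    if host_nuclei <= 0:
        raise ValueError("host_nuclei must be positive")
    return parasites / host_nuclei


def normalized_growth(parasites, host_nuclei, control_ratios):
    """Ratio for one sample image divided by the mean untreated-control ratio."""
    control_mean = sum(control_ratios) / len(control_ratios)
    if control_mean <= 0:
        raise ValueError("control ratio must be positive")
    return parasite_ratio(parasites, host_nuclei) / control_mean


# Hypothetical counts: two untreated control images and one drug-treated image.
controls = [parasite_ratio(120, 400), parasite_ratio(150, 500)]
treated = normalized_growth(30, 400, controls)
print(f"growth relative to untreated controls: {treated:.2f}")
```

A value of 1.0 would indicate growth equal to the untreated controls; values below 1.0 indicate inhibition.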

For oocyst quantification by high-throughput microscopy, we weighed collected 
mouse faeces and diluted them in PBS (5 µl mg⁻¹). Samples were incubated at 95 °C for 
10 min, vortexing every 2 min at high speed. Large debris was allowed to settle for 
10 min, then 10 µl of the suspension was mixed with 990 µl PBS and 1 µl of 
fluorescein isothiocyanate (FITC)-conjugated goat polyclonal anti-Cryptosporidium 
antibody (GeneTex). After 1 h at room temperature, the sample was centrifuged at 2,000g 
for 15 min. The pellet was suspended in 200 µl PBS and transferred to a 96-well plate 
for microscopy. Plates were imaged using BD Pathway and oocysts were counted 
using an ImageJ macro. Using a standard curve (uninfected mouse faeces spiked 
with known amounts of oocysts), oocyst counts were converted to oocysts per 
gram of faeces. 
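
The standard-curve conversion can be sketched as follows. This is a minimal illustration that assumes a linear relationship between imaged counts and spiked oocysts; the fitting method, function names and all numbers are assumptions and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def oocysts_per_gram(image_count, slope, intercept, faeces_mg):
    """Convert an image count to oocysts per gram using the fitted curve."""
    oocysts = max((image_count - intercept) / slope, 0.0)
    return oocysts / (faeces_mg / 1000.0)


# Hypothetical standard curve: oocysts spiked into uninfected faeces
# versus the number counted by the imaging macro.
spiked = [0, 1000, 10000, 100000]
counted = [2, 12, 102, 1002]
slope, intercept = fit_line(spiked, counted)

# Hypothetical sample: 52 objects counted from 20 mg of faeces.
print(round(oocysts_per_gram(52, slope, intercept, 20)))
```

Counts at or below the curve's intercept are reported as zero rather than as negative values.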

Quantification of oocyst shedding using qRT-PCR. DNA was extracted from 
100 mg faeces using ZR Faecal DNA MiniPrep Kit (Zymo Research Corporation) 
following the manufacturer’s protocol with slight modification. While in lysis 
buffer, the sample was freeze-thawed in liquid nitrogen five times before the first 
centrifugation step. Each sample was eluted in 50 µl water, and 1 µl of eluate was used 
for qRT-PCR along with 10 µM primers targeting Cryptosporidium 18S rRNA 
and SYBR Master Mix (Life Technologies) for detection. Each qRT-PCR reaction 
was normalized using an eight-point standard curve (faecal DNA purified from 
uninfected mouse faeces spiked with known amounts of oocysts) for each set 
of samples. 
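
One common way to apply such a standard curve is a log-linear fit of threshold cycle (Ct) against the logarithm of the spiked oocyst number. The sketch below assumes that approach; the text does not state how the curve was fitted, and the Ct values, slope and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def oocysts_from_ct(ct, slope, intercept):
    """Invert Ct = slope * log10(oocysts) + intercept."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical eight-point standard curve: 10^1 to 10^8 spiked oocysts.
spiked = [10 ** k for k in range(1, 9)]
ct_values = [38.0 - 3.3 * k for k in range(1, 9)]
slope, intercept = fit_line([math.log10(n) for n in spiked], ct_values)

print(round(oocysts_from_ct(24.8, slope, intercept)))
```

A separate curve would be fitted for each set of samples, as described above.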

Oocyst purification from mouse faeces. Oocysts were purified from faeces using 
sucrose suspension followed by a caesium chloride centrifugation’. Mouse faeces 
were suspended in tap water and passed through an 850-µm mesh filter, followed by a 
250-µm mesh. This filtered suspension was mixed 1:1 with aqueous sucrose solution 
(specific gravity 1.33), and centrifuged at 1,000g for 5 min. Oocysts were 
collected from the supernatant and suspended in 0.85% saline solution. 0.5 ml 
of this preparation was overlaid onto 0.8 ml of 1.15 specific gravity CsCl, and 
centrifuged for 3 min at 16,000g. Oocysts were collected from the top ml of the 
sample, washed in 0.85% saline, counted with a disposable counting chamber 
(KOVA International) and suspended in 2.5% potassium dichromate for storage 
at 4 °C. 

Western blotting. For western blot analysis, oocysts from wild-type and transgenic 
Nluc-Neo parasites were excysted as described earlier and sporozoites were lysed in 
SDS sample buffer. Protein extract from 10’ sporozoites was loaded per lane and 
subjected to electrophoresis on a precast Any kD Mini-PROTEAN TGX gel (Bio-Rad) 
followed by transfer to 0.2-µm nitrocellulose membrane (Bio-Rad). Blots were 
blocked and probed with an anti-neomycin phosphotransferase II antibody (EMD 
Millipore) at 1:1,000 dilution and goat anti-rabbit IgG (H + L)-HRP conjugate 
(Bio-Rad) at 1:20,000 dilution followed by detection with ECL Western Blotting 
Substrate (Thermo Pierce) and exposure to film. Equal loading of blots was 
controlled by stripping and reprobing with an antibody to α-tubulin. 

EdU labelling and immunofluorescence microscopy. EdU labelling was per- 
formed using the Click-iT EdU Alexa Fluor 594 Imaging Kit following the man- 
ufacturer’s instructions (Life Technologies). Purified stable transgenic oocysts 
expressing the luciferase or wild-type oocysts were inoculated into 24-well plates 
containing coverslips confluent with HCT-8 cells. After 24 h, EdU was added to 
the media at 10 µM and left for 18 h before fixation. For immunofluorescence, 
primary antibodies used were mouse monoclonal anti-human neomycin 
phosphotransferase II (NPII) (Alpha Diagnostic International), rabbit polyclonal 
anti-Nluc antibody (Promega Corporation), and polyclonal rabbit anti-C. parvum 
tryptophan synthase B (TrpB; B.S., unpublished observations) at 1:1,000; 
secondary antibodies were anti-mouse or anti-rabbit conjugated to Alexa488 or Alexa546 
(Molecular Probes, Life Technologies) at a dilution of 1:1,000. DNA was visualized 
with DAPI (2 mg ml⁻¹). Images were collected on an Applied Precision Delta 
Vision inverted epifluorescence microscope at the UGA Biomedical Microscopy 
Core, deconvolved and adjusted for contrast using SoftWoRx software. 

Statistical methods. All bar graphs depict the mean with standard deviations 
shown as error bars. Unless indicated otherwise, graphed data represent three 
technical replicates; each experiment was repeated at least twice and representative 
data are shown. No statistical tests were used to predetermine sample size. 
Unpaired t-tests were used appropriately to determine statistical significance 
and a P value <0.05 was considered significant. Assumptions for statistical tests 
were confirmed or corrected as described. No animals were excluded from experi- 
mental measurements. 

Extended Data Figure 1 | Optimization of sporozoite transfection. a, 
Ten million sporozoites prepared in either cytomix (BTX) or Lonza Buffers SE, 
SF or SG (4D Nucleofection) were combined with 10 µg DNA (Eno_Nluc-GS-Nluc_Eno). 
Samples were electroporated using previously determined 
settings for BTX (1,500 V, 25 Ω, 25 µF) or various program settings for 4D 
Nucleofection as indicated. Parasites were added to cultures of HCT-8 cells and 
luciferase activity was read after 48 h. Bars represent average of two technical 
replicates. b, Transfection was further optimized by comparing the best 
preliminary settings (buffers SF and SG; programs EH 100 and EO 100) with 
additional pulse programs as indicated. Transfection was carried out as in 
a. Bars represent average of two technical replicates. c, Electroporation systems 
(BTX and 4D Nucleofection) were compared using the same number of 
C. parvum sporozoites and quantities of DNA using buffers and conditions 
optimized in a and b. Bars represent average of three technical replicates. Note 
about tenfold enhancement of transient transfection using 4D Nucleofection. 
The impact of electroporation on stable transformation cannot be assessed 
in this setup and may be higher. Experiments in a and b were done once for 
the purpose of optimization, while c was repeated three times; a single 
representative experiment is shown. 

Extended Data Figure 2 | Direct surgical injection of transfected C. parvum 
sporozoites into the small intestine. Mice are shaved and anaesthetized 
with isoflurane (3% initially, then maintained at 1.5% for the surgery). The 
abdominal skin is disinfected with Betadine and a small incision is made into 
the peritoneum. Forceps are used to grasp the small intestine and 100 µl of 
PBS containing 10⁷ transfected C. parvum sporozoites is injected into the 
lumen. The peritoneum and the abdominal skin are each sutured with 
4-0 polydioxanone and mice are injected with meloxicam (1 mg kg⁻¹) 
subcutaneously. Each procedure takes around 15 min, and mice recover 
rapidly. 

Extended Data Figure 3 | Optimization of paromomycin treatment of 
infected mice. a, Dosing of mice accounting for drug concentration, animal 
weight, and measured daily water consumption. At 16 mg ml⁻¹ each mouse 
received 40 mg paromomycin daily (dotted line). b, This dose was found to be 
sufficient to decrease oocyst shedding in treated mice to background. By day 7 
mice without paromomycin treatment shed large amounts of oocysts when 
compared to untreated mice. Treated mice showed no shedding above 
background. Oocysts were enumerated by high-throughput imaging assay. Five 
mice were analysed individually with two technical replicates. 
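
The dosing arithmetic behind panel a can be sketched as below. This is an illustrative calculation only: the water consumption of 2.5 ml per day is an assumed value chosen because it reproduces the 40 mg daily amount stated in the caption, and the function names are hypothetical.

```python
def daily_dose_mg(concentration_mg_per_ml, water_ml_per_day):
    """Drug ingested per mouse per day from medicated drinking water."""
    return concentration_mg_per_ml * water_ml_per_day


def concentration_for_dose(target_dose_mg, water_ml_per_day):
    """Drinking-water concentration needed to reach a target daily amount."""
    if water_ml_per_day <= 0:
        raise ValueError("water consumption must be positive")
    return target_dose_mg / water_ml_per_day


# Assumed consumption of 2.5 ml per mouse per day (not a measured value).
print(daily_dose_mg(16, 2.5))           # mg of paromomycin per day
print(concentration_for_dose(40, 2.5))  # mg per ml of drinking water
```

In practice the consumption term would come from the measured daily water intake shown in panel a.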

Extended Data Figure 4 | Mouse model for selection of stable C. parvum 
transgenics. Repeat of the experiment described in Fig. 3b. a, Measurement of 
C. parvum infection using faecal PCR. b, Luminescence measurements. Note 
increasing luminescence from day 6 in parasites that received resistance 
and Cas9 plasmids. Mice were infected in groups of four per cage and pooled 
faeces were analysed for each cage (each measurement represents three 
technical replicates). 

Extended Data Figure 5 | C. parvum maintains the stable transgene when 
passed serially in mice without paromomycin treatment. a, Mice were 
infected orally with 100,000 transgenic oocysts. b, c, Infected mice were then 
treated with paromomycin (b) or left untreated (c). Oocysts were purified from 
faecal collections by sucrose flotation and CsCl centrifugation, and used to 
infect a second cohort of mice. Again, each mouse received 100,000 transgenic 
oocysts and mice were treated or not. Faeces were tested for luminescence 
every 3 days. Each reading represents the pooled faecal sample from five mice 
with three technical replicates. 

The insertion of precise genetic modifications by genome 
editing tools such as CRISPR-Cas9 is limited by the relatively 
low efficiency of homology-directed repair (HDR) compared 
with the higher efficiency of the nonhomologous end-joining 
(NHEJ) pathway. To enhance HDR, enabling the insertion of 
precise genetic modifications, we suppressed the NHEJ key 
molecules KU70, KU80 or DNA ligase IV by gene silencing, 
the ligase IV inhibitor SCR7 or the coexpression of adenovirus 
4 E1B55K and E4orf6 proteins in a ‘traffic light’ and other 
reporter systems. Suppression of KU70 and DNA ligase IV 
promotes the efficiency of HDR 4–5-fold. When co-expressed 
with the Cas9 system, E1B55K and E4orf6 improved the 
efficiency of HDR up to eightfold and essentially abolished 
NHEJ activity in both human and mouse cell lines. Our 
findings provide useful tools to improve the frequency of 
precise gene modifications in mammalian cells. 


The CRISPR-Cas9 systems (clustered, regularly interspaced, short 
palindromic repeats (CRISPR)-CRISPR-associated protein) repre- 
sent a versatile tool for genome engineering, enabling the induction 
of site-specific genomic double-strand breaks (DSBs) by single guide 
RNAs (sgRNAs)!. In mammalian cells DSBs are mostly repaired by the 
nonhomologous end-joining (NHEJ) pathway”, frequently leading to 
the loss of nucleotides from the ends of DSBs. This enables the efficient 
construction of knockout alleles through the induction of frameshift 
mutations‘. By contrast, the alternative pathway of homology-directed 
repair (HDR) can be used for the introduction of precise genetic modi- 
fications such as codon replacements or reporter insertions by recom- 
bination with exogenous targeting vectors, serving as repair template?. 
We reasoned that the efficiency of HDR and thus the construction of 
precise genetic modifications could be boosted by the transient inhi- 
bition of NHEJ key molecules, similar to what has been observed for 
Drosophila embryos with a genetic DNA ligase IV deficiency®. 

To quantitatively determine the outcome of CRISPR-Cas-induced 
DSB repair, we first generated human HEK293 cells with a ‘traffic 
light’ reporter (TLR) vector integrated into the adeno-associated 
virus integration site 1 (AAVS1) locus (Fig. 1a). HEK293 cells were 
transfected with an AAVS1 targeting vector carrying the TLR insert 
and expression plasmids for Cas9 and an AAVS1-specific sgRNA 
(Supplementary Fig. 1). Upon selection and genotyping of transfected 
cells, we obtained heterozygous (AAVS1^TLR/+) and homozygous 
(AAVS1^TLR/TLR) targeted clones harboring the TLR construct 
in the AAVS1 locus (Supplementary Fig. 2). The reporter includes 
a CAG promoter for expression of a nonfunctional green fluorescent 
(Venus) gene, disrupted by the replacement of codons 117-152 with 
target sequences from the mouse Rosa26 and Rab38 loci, followed 
by coding regions for a self-cleaving 2A peptide and a red fluorescent 
(TagRFP) gene in a reading frame shifted by 2 bp (Supplementary 
Table 1). CRISPR-Cas9-induced DSBs in the target region that are 
repaired by means of NHEJ and cause deletions shift the translation 
to the frame of the 2A-TagRFP in about 1/3 of the mutagenic NHEJ 
events; this can be detected in reporter cells by the expression of RFP 
(Fig. 1b). If an intact Venus coding region is provided as repair template, 
cells that repair the DSBs by HDR express Venus. For activation 
of the TLR reporter, we designed two sgRNAs against the Rosa26 target 
sequence, of which sgRosa26-1 showed a higher activity to induce 
deletions in the endogenous locus of mouse NIH3T3 cells (Fig. 1c). 

Next, we transfected AAVS1^TLR/+ cells with an expression vector for 
Cas9, blue fluorescent protein (BFP) and sgRosa26-1, together with a 
linearized Venus donor plasmid. At 72 h after transfection the cells were 
analyzed by FACS, gated for BFP+ transfected cells, and the frequency 
of Venus+ and of RFP+ cells was determined. We observed 3% RFP+ and 
5% Venus+ cells, indications of NHEJ or HDR repair events, respectively, 
as compared to 0.1% RFP+ and 0.6% Venus+ cells in a control lacking 
sgRosa26-1 (Fig. 1d). Of note, RFP+ cells detected by the TLR assay 
represent only 1/3 of all mutagenic NHEJ events. Similar results were 
obtained with AAVS1^TLR/TLR cells (Supplementary Fig. 3). 
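
Because only about one in three mutagenic NHEJ events shifts translation into the RFP frame, the RFP+ fraction underestimates total mutagenic NHEJ. A minimal sketch of the implied scaling is given below; the function name and the optional background subtraction are assumptions added for illustration and are not part of the reported analysis.

```python
def estimated_total_nhej(rfp_percent, background_percent=0.0, detected_events=3):
    """Scale the RFP+ percentage to all mutagenic NHEJ events.

    The TLR assay reports roughly one of every `detected_events` mutagenic
    NHEJ events, so the observed percentage is multiplied by that factor.
    """
    corrected = max(rfp_percent - background_percent, 0.0)
    return corrected * detected_events


# Values reported above: 3% RFP+ with sgRosa26-1, 0.1% RFP+ without it.
print(estimated_total_nhej(3.0))
print(round(estimated_total_nhej(3.0, background_percent=0.1), 1))
```

With the reported 3% RFP+ cells, this suggests that roughly 9% of transfected cells carried a mutagenic NHEJ event.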

For suppression of key NHEJ pathway proteins by short hairpin (sh) 
RNAs, we added a human H1 promoter to the sgRosa26-1/Cas9/BFP 
expression vector and inserted published shRNA sequences to knock 
down KU70, KU80 or DNA ligase IV. We first determined the extent of 
NHEJ suppression by transfection of AAVS1^TLR/+ cells with different 
sgRosa26-1-Cas9-BFP-H1 shRNA expression vectors in the absence 
of a repair template. At 72 h after transfection the samples were analyzed 
for RFP+ cells and compared to controls that had either a scrambled 
shRNA or no shRNA. A substantial suppression of NHEJ repair was 
observed upon the individual or combined knockdown of KU70, KU80 
or DNA ligase IV (Fig. 1e and Supplementary Fig. 4). Similar results 
were obtained in AAVS1^TLR/TLR cell lines by the knockdown of KU70, 
KU80 or DNA ligase IV (Supplementary Fig. 4). The knockdown of 
ligase IV reduced its protein level in transfected AAVS1^TLR (BFP+) cells 
by 70% (Supplementary Fig. 5). 

As additional approaches to DNA ligase IV inhibition, we used the 
small-molecule inhibitor SCR7 (ref. 10) or the adenovirus 4 (Ad4) 
E1B55K and E4orf6 proteins, which mediate the ubiquitination and 
proteasomal degradation of DNA ligase IV. For the coexpression 
of Ad4 proteins, sgRosa26-1, Cas9 and BFP from a vector pair, we linked 
the Ad4 E1B55K or the Ad4 E4orf6 gene by self-cleaving 2A peptide 
sequences to BFP (Fig. 1f). Mono- or biallelic AAVS1^TLR cell lines were 
transfected with either the sgRosa26-1-Cas9-BFP expression vector 
(in the presence or absence of SCR7 inhibitor) or both sgRosa26-1/ 
Cas9/BFP/Ad4 E1B55K and E4orf6 plasmids. The presence of SCR7 
reduced the fraction of RFP+ cells in the BFP+ population fourfold 
whereas the coexpression of Ad4 proteins led to an eightfold reduction, 
compared to controls lacking inhibitor or Ad4 proteins (Fig. 1f and 
Supplementary Fig. 6). The coexpression of Ad4 proteins reduced the 
level of DNA ligase IV protein in transfected AAVS1^TLR (BFP+) cells by 
93% (Supplementary Fig. 5). Overall, these results show that the NHEJ 
repair of CRISPR-Cas9-induced DSBs can be suppressed by targeting 
DNA ligase IV using RNA interference, SCR7 or, most efficiently, the 
Ad4 proteins. 

To assess the effect of NHEJ suppression on HDR, we transfected 
AAVS1^TLR/+ cells with Venus repair template together with sgRosa26-1/ 
Cas9/BFP vector (with and without SCR7) or with sgRosa26-1/Cas9/ 
BFP vectors including either the Ad4 proteins or shRNA constructs 
targeting KU70, KU80 or DNA ligase IV. After 72 h the frequency of RFP+ 
and Venus+ cells within BFP+ cells was analyzed by FACS (Fig. 2a,b 
and Supplementary Fig. 7). Venus+ (HDR) cells increased from 5% for 
sgRosa26-1/Cas9/BFP alone, to 8-14% in the presence of single shRNAs 
against KU70, KU80 or DNA ligase IV, to 25% in the presence of shRNAs 
against KU70 and DNA ligase IV or 1 µM of the inhibitor SCR7, and 
further to 36% upon the coexpression of the Ad4 proteins. Thus, HDR 
efficiency was enhanced up to fivefold in the presence of KU70 and 
ligase IV shRNAs or SCR7, and up to sevenfold by the Ad4 protein pair 
(Fig. 2c and Supplementary Fig. 8). 
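
The fold enhancements quoted above follow directly from the reported Venus+ percentages. A short sketch of that arithmetic, using only the values given in the text and a hypothetical helper name:

```python
def fold_change(treated_percent, control_percent):
    """Ratio of a treated frequency to the control frequency."""
    if control_percent <= 0:
        raise ValueError("control frequency must be positive")
    return treated_percent / control_percent


control = 5.0  # % Venus+ cells with sgRosa26-1/Cas9/BFP alone
conditions = {
    "KU70 + ligase IV shRNAs or SCR7": 25.0,
    "Ad4 E1B55K + E4orf6": 36.0,
}

for name, venus_percent in conditions.items():
    print(f"{name}: {fold_change(venus_percent, control):.1f}-fold")
```

The same calculation applies to NHEJ suppression if the control RFP+ frequency is divided by the treated one.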

Titration of SCR7 on AAVS1^TLR/+ cells showed an optimal effect 
at 1 µM concentration (Supplementary Fig. 9). For cells in the presence 
of two shRNAs, SCR7 or Ad4 proteins, we noticed diminished 
fluorescence signals within the population of Venus+ cells at 72 h after 
transfection (Fig. 2a and Supplementary Figs. 7,9c), indicating reduced 
Venus expression in cells undergoing NHEJ blockade, possibly caused 
by local chromatin remodeling through an extended DNA damage 
response. However, Venus expression was normal in clones established 
from AAVS1^TLR/+ cells targeted in the presence of Ad4 proteins, 
indicating that this effect is only transient (Supplementary Fig. 9d). 
From the sample expressing the Ad4 proteins, Venus+ cells were sorted, 
and we established 24 clones to confirm the integrity of the repaired 
TLR loci using PCR and sequence analysis (Supplementary Table 2). 
In contrast to the increase of Venus+ cells, RFP+ cells decreased from 
3% in the controls to 1.7%, 1.4% or 0.6% in the presence of shRNAs, 
SCR7 or Ad4 proteins, respectively (Fig. 2a,b). Whether the residual 
NHEJ activity relies on the KU- and ligase IV-independent alternative 
end-joining mechanism remains to be determined. 

To assess the influence of the lengths of homology regions of the 
repair template on HDR efficiency, we generated donor templates 
with 3′ homology regions shortened from 1,450 bp as in the original 
donor template to 350 bp (Fig. 2d and Supplementary Fig. 10). We 
transfected AAVS1^TLR/+ cells with sgRosa26-1/Cas9/BFP expression 
vector and each of the various donor templates, with or without the 
coexpression of Ad4 proteins. FACS analysis revealed a reduced targeting 
frequency (2%) for the donor with a 350 bp 3′ homology region 
whereas the other molecules showed HDR efficiencies in the range of 
5% (Fig. 2e). In the presence of Ad4 proteins the frequency of Venus+ 
cells increased robustly up to 25% for the 350-bp donor and to 30% 
for the other donor molecules. In line with the previous results, the 
frequency of RFP+ cells was strongly reduced by the coexpression of 
Ad4 proteins (Fig. 2f). Thus, PCR-generated fragments with combined 
homology regions of >1 kb are effective donors for HDR (Fig. 2e,f and 
Supplementary Fig. 10). 

Applying our approach to an endogenous genomic locus, we 
inserted a GFP reporter gene into the AAVS1 locus of HEK293 cells. 
The AAVS1-SA-T2A-GFP targeting vector includes AAVS1 homology 
regions flanking a splice acceptor site and a 2A peptide sequence linked 
to GFP, enabling reporter expression by the AAVS1-derived transcript 
(Fig. 2g). HEK293 cells were cotransfected with an AAVS1-specific 
sgRNA/Cas9/mCherry (AAVS1-1/Cas9/mCherry) expression plasmid 
and the AAVS1-SA-T2A-GFP targeting vector, with or without 
coexpression of Ad4 proteins. At 72 h after transfection the mCherry+ 
transfected population exhibited 8% GFP+ cells upon expression of 
sgAAVS1-1/Cas9 alone whereas the coexpression of Ad4 proteins raised 
the frequency of these cells to 66% (Fig. 2h). Thus, in line with the TLR 
results, we observed an eightfold stimulation of gene targeting at the 
AAVS1 locus by the coexpression of Ad4 proteins. After FACS sorting and 
cloning of GFP+ cells 48 h after transfection, we found correct GFP gene 
integration in 45 of 48 samples (95%), but only in 60% of the clones 
derived from the sample without Ad4 proteins (Supplementary Fig. 11 
and Supplementary Table 3), the remaining cells presumably 
representing random integrants. 

Coexpression of Ad4 proteins also promoted HDR efficiency in a 
mouse Burkitt lymphoma (BL) cell line containing an activated PI3 kinase 
α-subunit linked to an IRES element and a GFP reporter in the Rosa26 
locus. We targeted the GFP using a specific sgRNA and a promoterless 
donor vector replacing GFP by BFP (Fig. 2i). The fraction of BFP+ cells 
indicates HDR efficiency whereas the fraction of GFP− cells indicates 


Figure 1 Insertion of a traffic light reporter into the AAVS1 locus of HEK293 cells and suppression of the NHEJ pathway. (a) Strategy for insertion of the TLR 
construct into the AAVS1 locus using CRISPR-Cas9 in human HEK293 cells. In the targeted sequence, the AAVS1-specific sgRNA is indicated in blue and 
the PAM signal is shown in red. The pAAVS1-TLR targeting vector includes homology arms (HA) of 800 bp flanking a splice acceptor (SA)-2A-puromycin 
element and the traffic light reporter insert comprising a CAG promoter and a Venus gene inactivated by the replacement of 36 codons with target sequences 
from the mouse Rosa26/Rab38 loci (black insert). (b) Diagram of the TLR system. CRISPR-Cas9-induced DSBs in the target region, repaired by NHEJ 
resulting in deletions that shift translation by 2 bp, led to RFP expression. An intact Venus coding region serves as repair template for HDR, leading to Venus 
expression. (c) Strategy to target the human AAVS1^TLR locus. Two different protospacers targeting the mouse Rosa26-derived sequence that interrupts the 
Venus gene are indicated in blue, with PAM signals in red. NIH3T3 cells were transfected by empty vectors or vectors expressing sgRosa26-1 or sgRosa26-2/ 
Cas9-T2A-mCherry to test cutting efficiency in the endogenous Rosa26 locus. 48 h after transfection, mCherry+ cells were sorted and PCR and T7EI assays 
performed. The percentage of indels was quantified by the ImageJ software. (d) AAVS1^TLR cells were co-transfected with linearized Venus repair vector 
and Cas9-BFP expression plasmids with or without (−) sgRosa26-1. 72 h after transfection, flow cytometric analysis of BFP+ gated cells displayed 5% of 
Venus+ (HDR repair) cells and 3% of RFP+ (NHEJ repair) cells. The graphs represent triplicate data from one of three independent experiments with similar 
results, shown as mean ± s.d. (e) Inhibition of the NHEJ pathway using gene silencing. Scheme of the sgRosa26-1/Cas9-2A-BFP-H1shRNA expression 
vector. AAVS1^TLR cells were transfected with sgRosa26-1/Cas9-2A-BFP (−) or sgRosa26-1/Cas9-2A-BFP-H1-shScrambled, -shKU70, -shLIG4 or -shKU70 
or -shLIG4. 3 days later, these cells were analyzed by flow cytometry, gated on BFP+ transfected cells, and the percentage of RFP+ cells was determined. 
The graphs represent triplicate data from 1 of 3 independent experiments with similar results, shown as mean ± s.d. (f) Suppression of NHEJ repair using 
the ligase IV inhibitor SCR7 or the coexpression of ligase IV degrading Ad4 E1B55K and E4orf6 (Ad4s) proteins. AAVS1^TLR cells were transfected with 
sgRosa26-1/Cas9-2A-BFP expression vector (−) or transfected and cultured with SCR7 or were transfected with the sgRosa26-1/Cas9-2A-BFP-Ad4-E1B55K 
or -E4orf6 expression vectors. The samples were analyzed 3 days later by flow cytometry, gating on the BFP+ transfected cells, and the percentage of RFP+ 
cells was determined. The graphs represent triplicate data from one of six independent experiments with similar results, shown as mean ± s.d. Significance 
was calculated using the Student’s t-test: **P < 0.01, ***P < 0.001; ns, not significant. 

NHEJ-mediated deletion events. Mouse BL cells were electroporated 
with Cas9 and the BFP replacement vector alone or with GFP-specific 
sgRNA/Cas9 expression plasmid with or without the coexpression of 
Ad4 proteins. Electroporation with sgGFP and Cas9 led to a transfection 
efficiency of 40%, as determined by mCherry expression after 24 h. About 
half of the transfected cells (22% in total) lost GFP expression after 72 h 
(Supplementary Fig. 12) and 10% of the GFP− cells were BFP+ (Fig. 2j). 
The addition of Ad4 proteins reduced transfection efficiency to 27%, but 
again about half of the transfected cells (14% in total) lost GFP expression 
after 72 h (Supplementary Fig. 12). Notably, 50% of the GFP− cells 
were now BFP+ (Fig. 2j), indicating a fivefold stimulation of HDR by 
Ad4 proteins. 

To assess whether a point mutation can be corrected in a tumor 
cell line, we reverted the T24P codon replacement in the Foxo1 gene 
of mouse BL cells, which renders FOXO1 resistant to AKT-dependent 
phosphorylation. We transfected BL cells with a mutation-specific sgRNA 
and a second sgRNA recognizing the first intron, expression vectors for 
Cas9 and fluorescent reporters without or together with Ad4 proteins 
and a targeting vector containing the reverted codon 24 (and harboring 
a silent nucleotide replacement) and a puromycin-resistance gene 
(Supplementary Fig. 13). At day 2 transfected cells were isolated by FACS 
sorting, subjected to puromycin selection, and 53 or 39 clones derived 
from single cells were established from the sample without or with 
coexpression of Ad4 proteins, respectively. Genotyping of these clones by PCR 
and sequence analysis showed that 43 of the 53 clones derived without 
Ad4 proteins were mutants; 33 of these were heterozygous and 10 clones 
(19%) were targeted on both alleles. In the presence of Ad4 proteins, we 
found that all 39 clones were targeted; 24 of these were heterozygous and 
15 clones (38%) were homozygous mutants. Thus, despite the expected high 
targeting rate achieved with a selectable donor vector, the coexpression 

Figure 2 Enhancement of HDR for CRISPR-Cas9-induced precise gene 
targeting. (a) Improvement of HDR efficiency by suppression of NHEJ key 
molecules. AAVS1^TLR cells were cotransfected with linearized Venus repair 
vector and sgRosa26-1/Cas9/BFP expression plasmid together with shRNA 
cassettes, SCR7 inhibitor or the Ad4 E1B55K/E4orf6 proteins. The frequency 
of RFP+ and Venus+ cells within the transfected BFP+ population was 
determined by flow cytometry. The data represent one of four independent 
experiments with similar results. (b) The graph summarizes the frequency 
of RFP+ (pink bars; significance compared to the (−) sample) and of Venus+ 
(green bars; significance compared to the (−) sample) cells determined 
as in a. The y axis is represented as log10 scale. The bars represent mean 
values ± s.d. (c) Relative increase of HDR efficiency (significance compared 
to the (−) sample) and of NHEJ suppression normalized to the control 
transfected with sgRosa26-1/Cas9 and targeting vector alone (−). The dotted 
line indicates a more than twofold increase of HDR. The bars represent mean 
values ± s.d. (d) Use of PCR-generated donor templates. Scheme of the TLR 
construct and of PCR donor fragments, having a constant 5′ homology region 
of 400 bp whereas the length of the 3′ homology region varies from 350 bp 
to 1,250 bp; as control the linearized promoterless Venus repair vector was 
used. (e) AAVS1^TLR cells were cotransfected with sgRosa26-1/Cas9/BFP 
expression vector, the same molar amounts (800 fmol) of PCR donors or 
linearized Venus repair vector without or together with Ad4 E1B55K/E4orf6 
proteins. The percentage of Venus+ and RFP+ cells was analyzed by flow 
cytometry 3 days after transfection. (f) The graph summarizes the frequency 
of RFP+ (purple bars) and Venus+ (green bars) cells determined as in e. The 
data represent one of two independent experiments with similar results. The 
bars represent mean values ± s.d. (g) Strategy for insertion of a GFP reporter 
gene into the human AAVS1 locus using CRISPR-Cas9 in human HEK293 
cells. The CRISPR-Cas9-targeted site is shown in Figure 1a; in the AAVS1-GFP 
targeting vector the GFP gene is flanked by AAVS1 homology regions of 
800 bp. (h) HEK293 cells were cotransfected with linearized AAVS1-GFP 
targeting vector and with sgAAVS1-1/Cas9/mCherry expression vector or with 
sgAAVS1-1/Cas9/mCherry-Ad4-E1B55K and -E4orf6 expression vectors. 
At day 3, the frequency of GFP+ cells within the population of mCherry+ 
transfected cells was analyzed using the flow cytometric data; shown is one 
of triplicate samples obtained from one of two independent experiments. 
(i) Fluorescent reporter replacement in the mouse BL cell line. The cell 
line harbors an activated PI3 kinase (P110*)-IRES-GFP-pA cassette in the 
Rosa26 locus. In the targeted GFP sequence, the sgGFP target sequence is 
highlighted in blue and the PAM element in red. DSBs repaired by the NHEJ 
pathway led to the inactivation of GFP. DSBs repaired by HDR with the pBFP 
donor template led to the replacement of GFP by the BFP reporter gene. 
(j) Mouse BL cells were electroporated with Cas9/mCherry or sgGFP/Cas9/ 
mCherry expression vector and the pBFP donor plasmid without or together 
with coexpression of the Ad4 E1B55K/E4orf6 proteins. The frequency of 
GFP−BFP− (white bars) and GFP−BFP+ (blue bars) cells was analyzed at 
day 3. The graph summarizes triplicate results from one of three independent 
experiments with similar results. The bars represent mean values ± s.d. 
Significance was calculated using the Student’s t-test: **P < 0.01, 
***P < 0.001, ****P < 0.0001; ns, not significant. 

of Ad4 proteins further increased the targeting efficiency to 100% and 
doubled the net yield of homozygous, targeted clones. 
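
The clone-level percentages in this passage can be recomputed from the reported counts. The sketch below uses only the numbers given in the text; the helper name and dictionary layout are hypothetical.

```python
def percent(part, total):
    """Percentage of `part` in `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Clone counts reported for the Foxo1 codon 24 reversion experiment.
samples = {
    "without Ad4": {"clones": 53, "targeted": 43, "both_alleles": 10},
    "with Ad4": {"clones": 39, "targeted": 39, "both_alleles": 15},
}

for name, c in samples.items():
    targeted = percent(c["targeted"], c["clones"])
    biallelic = percent(c["both_alleles"], c["clones"])
    print(f"{name}: {targeted:.0f}% targeted, {biallelic:.0f}% on both alleles")

gain = percent(15, 39) / percent(10, 53)
print(f"gain in biallelic clones: {gain:.1f}-fold")
```

The ratio of the two biallelic fractions is close to 2, consistent with the statement that the yield of homozygous targeted clones doubled.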

In summary, we show that for CRISPR-Cas9-induced mutagenesis 
the suppression of the NHEJ key enzyme DNA ligase IV is an effec- 
tive way for engineering precisely targeted mutations into the genome 
of mammalian cells. The activity of ligase IV can be blocked by gene 
silencing, small-molecule inhibition or proteolytic degradation, offer- 
ing diverse approaches for the optimal delivery into target cells. For the 
proteolytic degradation of DNA ligase IV, we selected the E1B55K and 
E4orf6 proteins of adenovirus 4, shown to exert minimal influence on 
other cellular substrates such as Mre11 or p53, which are co-targeted by 
many other serotypes!”. Nevertheless, we cannot exclude the possibility 
that HDR stimulation by the Ad4 proteins is mediated by the combined 
suppression of DNA ligase IV and other regulatory proteins. It could be of 
further interest to compare the effect on HDR of the E1B55K and E4orf6 
proteins of additional adenoviral serotypes and species identified from 
humans!® and other vertebrates!®. Furthermore, the use of Ad4 proteins 
may be also beneficial for the construction of targeted mouse mutants, 
as recently shown for zygotes cultured in the presence of SCR7 (ref. 20). 
In populations of cells transfected with sgRNA, Cas9 and Ad4 protein
expression plasmids, we presently reach knock-in frequencies of 50-66%.
By delivering Cas9 and sgRNAs as synthetic RNAs it may be possible, as
shown for human induced pluripotent stem cells (ref. 21), to further enhance
gene targeting efficiencies. It will be interesting to apply CRISPR-Cas9
mutagenesis combined with NHEJ suppression also to early embryos
of other model organisms and to primary mammalian cells to achieve 
gene corrections. 


METHODS 
Methods and any associated references are available in the 
online version of the paper. 


Note: Any Supplementary Information and Source Data files are available in the 
online version of the paper. 
ONLINE METHODS

Traffic light reporter construct. The traffic light reporter (TLR) expression 
construct was assembled by cloning of PCR fragments encoding a defective 
Venus (codons 117-152 replaced by a 52-bp segment derived from the mouse 
Rosa26 locus (sgRNA target sequence underlined) and a 56-bp segment from the 
mouse Rab38 gene) linked to the 2A peptide from Thosea asigna”* and the cod- 
ing region of TagRFP”? in a 2-bp shifted reading frame (+3), cloned in between 
the CAG promoter and the polyA region of the bovine hGH gene. The CAG 
promoter was deleted from this plasmid to derive the traffic light targeting vector 
(Supplementary Table 1). 


Cell culture and reagents. Wild-type, AAVS1™® HEK293 and mouse NIH3T3 
cells were maintained in DMEM (Gibco) supplied with 15% FBS (Gibco); cells
were passaged three times per week. The mouse Burkitt lymphoma cell line, 
generated from a Burkitt-like mouse lymphoma!® was maintained in DMEM 
supplied with 15% FBS, 2 mM HEPES (Gibco), 2 mM sodium pyruvate (Gibco), 
2 mM L-glutamine (Gibco), 1x NAA (Gibco) and beta-mercaptoethanol (Sigma),
and passaged four times per week. For puromycin selection, mCherry+ cells
were sorted, seeded at 10? cells/well and selected with 3 µg/ml of puromycin
for 2 weeks. Then colonies were counted and single cells were sorted. The SCR7
inhibitor was purchased from Xcess Biosciences (San Diego, USA); 12 h after
transfection, cells were maintained in complete medium supplied with 1 µM
SCR7 inhibitor until analysis. At SCR7 concentrations of 60 µM and 10 µM, we
observed a reduction of transfection efficiency and of cell viability.


Donor vectors and CRISPR-Cas9-T2A-reporter vectors. To generate the 
CRISPR-Cas9-T2A reporter vector, we amplified T2A-mCherry and T2A-BFP
fragments by overlapping PCR and cloned them into the FseI/EcoRI sites of
plasmid pX330 (Addgene, #42230). The mCherry and BFP templates were derived
from the plasmids MSCV-IRES-mCherry and MSCV-IRES-BFP, respectively, a
kind gift of Frank Rosenbauer and Martin Janz (Charité, Berlin). To generate the
AAVS1-SA-2A-Puro-TLR targeting vector, a CAG-Venus+1-P2A+3-mtagRFP+3
cassette was cloned into the SalI/NotI sites of the AAVS1-SA-2A-Puro targeting
vector (Addgene, #22075). In addition, a SA-2A-GFP fragment was generated by
overlapping PCR and inserted into the XhoI/SalI sites of the AAVS1 targeting
vector, with the GFP template derived from a pRosa26-IRES-GFP plasmid (V.T.C.,
unpublished). For reporter replacement in the mouse cell line, the P110*-IRES-
BFP-pA construct was generated by overlapping PCR and cloned into the
pSTBlue-1 sequencing plasmid (Novagen). For PCR donor templates, fragments were
amplified with Herculase II Fusion DNA Polymerase (Agilent Technology) from
pVenus+1-P2A+3-tagRFP plasmid using the same forward primer and different
reverse primers (Supplementary Table 4).


Generation of CRISPR-Cas9 vector expressing shRNA or Ad4 proteins. To generate 
CRISPR-Cas9-T2A-BFP/mCherry-hH1-shRNA vectors, we generated the human 
H1 promoter and MCS sequence by overlapping PCR and cloned it into the NotI 
site of the CRISPR-Cas9-T2A-BFP/mCherry plasmid. To obtain CRISPR-Cas9-
T2A-BFP/mCherry-P2A-Ad4 E1B55K or E4orf6 plasmids, the coding regions
for adenoviral serotype 4 proteins were synthesized as mammalian codon-
optimized sequences (Supplementary Table 1) by Genscript (Piscataway, NJ,
USA). Using these genes and BFP and mCherry template plasmids, we amplified
BFP/mCherry-P2A-Ad4 E1B55K and BFP/mCherry-P2A-Ad4 E4orf6 fragments
and cloned them into the NheI/EcoRI sites of the CRISPR-Cas9-T2A-reporter 
plasmid by Gibson assembly (New England Biolabs, E2611S). 


sgRNA and shRNAs. sgRNAs were designed from unique 20-nt sequences, with an
A or G selected as the last nucleotide before the PAM signal. The
target sequences should hybridize with the sgRNA scaffold only at low energy 
as predicted by the Mfold web server (http://mfold.rna.albany.edu/?q=mfold/ 
rna-folding-form). Complementary oligonucleotides were ordered separately, 
annealed, phosphorylated and cloned into the BbsI sites of the CRISPR-Cas9- 
T2A-reporter plasmid (Supplementary Table 4). For shRNA silencing, shRNA- 
targeted sequences were selected from previous reports***° (Supplementary 
Table 4). Complementary oligonucleotides were ordered separately, annealed, 
phosphorylated and cloned into the BamHI/AflII sites of the CRISPR-Cas9-T2A- 
reporter-hH1 plasmid. 
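
The sgRNA selection rule described above (a unique 20-nt sequence whose last nucleotide before the PAM is A or G) can be sketched as a small filter. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `candidate_sgrna_targets` is hypothetical, an NGG PAM is assumed, only the forward strand is scanned, uniqueness is checked within the supplied sequence rather than the genome, and the Mfold secondary-structure check is not modelled.

```python
import re


def candidate_sgrna_targets(seq, length=20):
    """Return (start, protospacer, pam) tuples for forward-strand sites.

    A site is reported when a `length`-nt protospacer is followed by an
    NGG PAM (assumed here), the protospacer's last nucleotide is A or G,
    and the protospacer occurs only once in `seq`.
    """
    seq = seq.upper()
    hits = []
    # Zero-width lookahead so that overlapping candidate sites are all seen.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    for match in pattern.finditer(seq):
        protospacer, pam = match.group(1), match.group(2)
        ends_in_purine = protospacer[-1] in "AG"
        # Unique if the first occurrence is this one and no later one exists.
        is_unique = (seq.find(protospacer) == match.start()
                     and seq.find(protospacer, match.start() + 1) == -1)
        if ends_in_purine and is_unique:
            hits.append((match.start(), protospacer, pam))
    return hits


if __name__ == "__main__":
    toy = "TTT" + "ACGTACGTACGTACGTACGA" + "TGG" + "CC"  # toy sequence
    for start, protospacer, pam in candidate_sgrna_targets(toy):
        print(start, protospacer, pam)
```

In practice the uniqueness test would be run against the whole target genome, and candidates would still need the scaffold-hybridization check described in the text.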


NATURE BIOTECHNOLOGY 


Transfection and electroporation. Human HEK293 and mouse NIH3T3 cells
were plated into 24-well or 6-well plates 1 day before transfection. On the day
of transfection, these cells were supplied with new complete medium and the
DNA was mixed with FuGENE HD Reagent (Promega) in Opti-MEM (Invitrogen)
according to the manufacturer's instructions. After 15 min of incubation at
room temperature, the mixture was dropped slowly into the well. For
electroporation, mouse BL cells were harvested and counted, and 1-2 x 10° cells
were resuspended with 3 µg plasmid DNA in 100 µl electroporation buffer,
transferred to a 0.2 cm cuvette (Sigma) and electroporated using a Nucleofector
device (Lonza). Then, cells were transferred into prewarmed complete medium.


Cell sorting and flow cytometry. For single-cell cloning, single cells were
sorted into 96-well plates with 150 µl complete medium supplied with 10 µg/ml
gentamicin (Lonza). These plates were briefly centrifuged and incubated at
37 °C, 5% CO2; the single-cell clones were evaluated 3 days after sorting to
exclude multiple cell contamination. Cells were cultured until confluence and
duplicated for genotyping PCR. For bulk sorting, the reporter-positive cells were
sorted into 15-ml Falcon tubes with complete medium; cells were centrifuged and
further cultured or used for the isolation of genomic DNA. For flow cytometry
analysis, HEK293 cells were trypsinized and resuspended in PBS/1% BSA FACS
buffer and analyzed with a Fortessa machine (Becton Dickinson). Mouse BL
cells were harvested, centrifuged and resuspended in PBS/1% BSA FACS buffer.


Genomic DNA isolation, PCR and T7EI assay. Reporter+ cells were cultured
and harvested at different time points. Single-cell clones were duplicated in
96-well plates. Genomic DNA was extracted using the QuickExtract DNA
extraction kit (Epicentre) following the manufacturer’s instructions. For the T7EI
assay, PCR was done using Herculase II Fusion DNA Polymerase (Agilent 
Technology) with PCR gene-specific primers (Supplementary Table 4) using the 
following conditions: 98 °C for 3 min; 35-37 cycles (95 °C for 20 s, 60 °C for 20 s, 
72 °C for 20 s) and 72 °C for 3 min. PCR products were run on 2% agarose gels, 
purified, denatured, annealed and treated with T7EI (New England Biolabs). 
Cleaved DNA fragments were separated on 2% agarose gels and the DNA con- 
centration of each band was quantified using the ImageJ software. Percent values 
of indels were calculated as described>”. For genotyping PCR, genomic DNA was 
amplified using DreamTaq DNA Polymerase (Thermo Scientific) with primers 
listed in Supplementary Table 4. 
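
The indel calculation itself is only cited above, not reproduced. As a hedged illustration, the sketch below uses an estimate that is commonly applied to T7EI band intensities, indel % = 100 × (1 − sqrt(1 − f)), where f is the cleaved fraction of the total signal. It is an assumption that the cited method matches this formula; the function name and the example intensities are hypothetical.

```python
import math


def t7e1_indel_percent(uncut, cleaved_1, cleaved_2):
    """Estimate indel frequency (%) from T7EI gel band intensities.

    uncut      -- intensity of the undigested PCR product band
    cleaved_1  -- intensity of the first cleavage product band
    cleaved_2  -- intensity of the second cleavage product band
    Uses the commonly applied estimate 100 * (1 - sqrt(1 - f)), where f is
    the cleaved fraction (assumed formula, not taken from the text).
    """
    total = uncut + cleaved_1 + cleaved_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cleaved_1 + cleaved_2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Hypothetical ImageJ intensities in arbitrary units.
    print(round(t7e1_indel_percent(75.0, 15.0, 10.0), 2))
```

The square root reflects that heteroduplexes form by random re-annealing of mutant and wild-type strands, so the cleaved fraction is not a direct readout of the mutant fraction.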


DNA sequencing. PCR products were directly sequenced by specific primers 
or cloned into the pSTBlue-1 Blunt vector (Novagen) following the manufac- 
turer’s protocol. Plasmid DNAs were isolated using the NucleoSpin Plasmid 
(Macherey-Nagel). Plasmids were sequenced using T7 forward primer
(5'-TAATACGACTCACTATAGGG-3') by the Sanger method (LGC Genomics,
Berlin, Germany).


Western blot analysis. Transfected AAVS1'® Reporter (BFP+) cells were iso-
lated by FACS and 10 x 10° cells were lysed on ice in RIPA buffer (20 mM Tris-
HCl (pH 7.5), 150 mM NaCl, 1 mM EDTA, 1% NP-40, 0.1% SDS, 0.1% sodium
deoxycholate) for 20-30 min in the presence of protease inhibitors (Roche). The 
whole-cell lysates were centrifuged for 10 min at 14,000 r.p.m. The supernatants 
were transferred into new tubes and protein concentrations were determined 
using the BCA protein assay (Bio-Rad). The lysates were boiled at 100 °C for 
5 min and loaded on SDS-PAGE gels. Blots were probed with anti-DNA ligase IV 
(H-300, Santa Cruz Biotechnology) and anti-beta-actin (AC-74, Sigma) antibod- 
ies. Blots were developed with secondary goat anti-rabbit IgG HRP (Southern 
Biotech) or anti-mouse IgG HRP (Southern Biotech) and bands visualized using 
the ECL detection kit (GE Healthcare). 
A bacteriophage encodes its own CRISPR/Cas
adaptive response to evade host innate immunity 


Kimberley D. Seed, David W. Lazinski, Stephen B. Calderwood & Andrew Camilli


Bacteriophages (or phages) are the most abundant biological entit- 
ies on earth, and are estimated to outnumber their bacterial prey by 
tenfold’. The constant threat of phage predation has led to the 
evolution of a broad range of bacterial immunity mechanisms that 
in turn result in the evolution of diverse phage immune evasion 
strategies, leading to a dynamic co-evolutionary arms race”. 
Although bacterial innate immune mechanisms against phage 
abound, the only documented bacterial adaptive immune system 
is the CRISPR/Cas (clustered regularly interspaced short palin- 
dromic repeats/CRISPR-associated proteins) system, which provides 
sequence-specific protection from invading nucleic acids, including 
phage*"'. Here we show a remarkable turn of events, in which a 
phage-encoded CRISPR/Cas system is used to counteract a phage 
inhibitory chromosomal island of the bacterial host. A successful 
lytic infection by the phage is dependent on sequence identity 
between CRISPR spacers and the target chromosomal island. In 
the absence of such targeting, the phage-encoded CRISPR/Cas sys- 
tem can acquire new spacers to evolve rapidly and ensure effective 
targeting of the chromosomal island to restore phage replication. 

Vibrio cholerae serogroup O1 is the primary causative agent of the 
severe diarrhoeal disease cholera, and lytic V. cholerae phages have 
been implicated in easing disease burden, particularly in the endemic 
region surrounding the Bay of Bengal’*"*. We recently described the 
isolation of the ICP1 (for the International Centre for Diarrhoeal 
Disease Research, Bangladesh cholera phage 1)-related, V. cholerae 
O1-specific virulent myoviruses that are omnipresent among cholera
patient rice-water stool samples collected at the ICDDR,B from 2001 to 
2011 (ref. 14 and present study). V. cholerae readily evolves resistance 
to ICP1 predation through mutations in O1 antigen biosynthetic genes 
outside the human host; however, this mutational escape comes at a 
cost as virulence necessitates maintenance of the O1 antigen’. This 
dynamic between predation by ICP1 and virulence of V. cholerae O1, 
specifically in the context of human infection, provides a unique oppor- 
tunity for discovery of novel bacterial immunity and phage immune 
evasion strategies. One bacterial defensive strategy against phages is 
the CRISPR/Cas system. CRISPR loci consist of an array of short 
direct repeats separated by highly variable spacer sequences of precise 
length corresponding to segments of previously captured foreign 
DNA (protospacers)*””. CRISPR loci are found in ~40% and ~90% 
of sequenced bacterial and archaeal genomes, respectively*'®. The 
CRISPR array is transcribed and the transcript cleaved into small 
CRISPR RNAs (crRNAs) that, in conjunction with the Cas proteins, 
execute an efficient process of immunity in which foreign nucleic acids 
are recognized by hybridization to crRNAs and cleaved*’®. 
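
The repeat-spacer layout of a CRISPR array described above can be illustrated with a short sketch that recovers the spacers lying between exact copies of a direct repeat. It is a simplification under stated assumptions: the function name `extract_spacers` and the toy sequences are hypothetical, repeats must match the consensus exactly (degenerate repeats are not handled), and this is not how a dedicated CRISPR-finding program works.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacer sequences flanked by exact copies of `repeat`.

    Sequence before the first repeat (for example a leader) and after the
    last repeat is discarded, because it is not bounded by two repeats.
    """
    array_seq = array_seq.upper()
    repeat = repeat.upper()
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    parts = array_seq.split(repeat)
    # parts[0] precedes the first repeat; parts[-1] follows the last one.
    return [segment for segment in parts[1:-1] if segment]


if __name__ == "__main__":
    toy_repeat = "GTTCAC"  # toy stand-in for a real direct repeat
    toy_array = ("TT" + toy_repeat + "AAAT" + toy_repeat
                 + "CCCA" + toy_repeat + "GG")
    print(extract_spacers(toy_array, toy_repeat))
```

Each recovered spacer could then be compared with a candidate target to ask whether it is identical to a protospacer.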

We isolated eleven ICP1-related phages from stools of cholera 
patients at the ICDDR,B (ref. 14 and present study), five of which 
encode a CRISPR/Cas system located between open reading frames 
(ORFs) 87 and 88 of the ancestral ICP1 genome™. The GC content of
this CRISPR/Cas system is the same (~37%) as the rest of the ICP1 
genome. The ICP1 CRISPR/Cas system consists of two CRISPR 
loci (designated CR1 and CR2) and six cas genes (Fig. 1a) whose
organization and protein products are most homologous to Cas
proteins of the type I-F (Yersinia pestis) subtype system’? (Supplementary
Table 1). V. cholerae is divided into two biotypes, classical
and El Tor, the former of which is associated with earlier pandemics 
and has since been replaced by the El Tor biotype’®. The classical strain, 
V. cholerae 0395, has a CRISPR/Cas system belonging to the type I-E 
(Escherichia coli) subtype’’, and to date there has not been any des- 
cription of El Tor strains possessing a CRISPR/Cas system. Thus, 
the origin of the CRISPR/Cas system in ICP1 phage is unknown. 
Protospacer-adjacent motifs (PAMs) are type-specific, short con- 
served sequence motifs in the immediate vicinity of protospacers that 
are required for acquisition and targeting’”'""’. In contrast to the GG 
PAM reported for the type I-F CRISPR/Cas systems in bacteria’, the 
protospacers targeted by the ICP1 CRISPR array have a GA PAM 
(Supplementary Fig. 1). 

The majority of spacers in the ICP1 CRISPR show 100% identity 
to sequences within an 18-kilobase (kb) island found in a subset of 
V. cholerae strains that include the classical strain 0395 isolated in 
India in 1964, El Tor strain MJ-1236 isolated in Bangladesh in 1994, 
and several El Tor strains collected at the ICDDR,B between 2001 and 
2011 (Supplementary Table 2). The 18-kb island resembles the phage- 
inducible chromosomal islands (PICIs) of Gram-positive bacteria, 
including the prototype Staphylococcus aureus pathogenicity islands 
(SaPIs)*°*". SaPIs are induced to excise, circularize and replicate 
Figure 1 | Genomic organization of the ICP1 CRISPR/Cas system. a, The
ICP1 phage CRISPR/Cas system consists of six cas genes and two CRISPR loci
(CR1 and CR2). b, For each CRISPR locus, the repeat (28 bp) and spacer (32 bp)
content is detailed as grey diamonds and coloured rectangles, respectively. 
Repeats (28 bp) that match the repeat consensus are shown in grey diamonds, 
and degenerate repeats are indicated in hatched grey diamonds. An AT-rich 
leader sequence precedes each CRISPR locus (grey rectangle). Spacers are 
coloured according to the percentage identity (solid represent 100% identity, 
gradient represents 81-97% identity). A fifth ICP1-related phage 
(ICP1_2003_A) has a genetically identical CRISPR/Cas system to 
ICP1_2004_A, and has been omitted for simplicity. c, The RNA sequence of the 
CR1 and CR2 consensus repeat with the partially palindromic sequence 
forming the predicted stem in the crRNA underlined. 
following infection by certain phages. They use varied mechanisms to
interfere with the phage reproduction cycle to enable their own pro- 
miscuous spread’', and this can protect the surrounding bacterial 
population from further phage predation. The organization of the 
V. cholerae 18-kb island targeted by the ICP1 CRISPR/Cas system is 
similar in length, base composition and organization to that observed 
in the SaPIs subset of PICIs, with an integrase homologue at one end 
and a GC content lower than that of the host species (37% compared to 
47.5%). We therefore refer to the 18-kb element as the V. cholerae 
PICI-like element (PLE) (Fig. 2). 
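
The base-composition comparison above (37% GC for the island against 47.5% for the host) rests on a simple calculation, sketched here for illustration. The function name `gc_percent` and the toy input are hypothetical, and ambiguous bases such as N are ignored by assumption.

```python
def gc_percent(seq):
    """Return the GC content of `seq` as a percentage of its A/C/G/T bases."""
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


if __name__ == "__main__":
    print(gc_percent("ATGCATTTAA"))  # toy sequence, not real island data
```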

To address the functional relevance of the ICP1 CRISPR/Cas sys- 
tem, we focused on the interaction between the paired ICP1_2011_A 
phage and the V. cholerae O1 El Tor strain (harbouring PLE1) that 
were isolated from the same stool sample (for simplicity hereafter 
referred to as ICP1 and V. cholerae PLE+). ICP1 has two CRISPR
spacers (8 and 9) (Fig. 1b) that have 100% identity to sequences within
the V. cholerae PLE (Fig. 2 and Supplementary Table 2). Using the
standard soft agar overlay method, we found that ICP1 can plaque
efficiently on V. cholerae PLE+ (Fig. 3b). We used northern blot
analysis to confirm that ICP1 crRNAs are transcribed and processed 
during V. cholerae infection (Supplementary Fig. 2). To test whether 
targeting of the PLE by the ICP1 CRISPR/Cas system affects phage 
fitness, we eliminated spacer 8 and 9 targeting. Spacer 8 targeting was 
disrupted by introducing silent mutations into its target within the 
PLE, generating V. cholerae PLE(8*) (Fig. 3a). We then infected this 
strain with a spontaneous ICP1 spacer 9 deletion mutant, referred to 
as ICP1(ΔS9). ICP1(ΔS9) was blocked for plaque formation on
V. cholerae PLE(8*); however, it maintained wild-type plaquing effi-
ciency on V. cholerae PLE+ (Fig. 3b). Importantly, V. cholerae PLE(8*)
is sensitive to plaque formation by ICP1 (Fig. 3b), which still harbours 
one spacer (S9) targeting the PLE. These results demonstrate that ICP1 
CRISPR/Cas must target the PLE for destruction in order to effectively 
infect and form plaques, and that a single spacer that targets the PLE 
is sufficient to facilitate successful phage replication. A mutant in 
which PLE ORFs 7-20 were deleted was susceptible to infection by 
ICP1(ΔS9) with wild-type plaquing efficiency (Supplementary Fig. 3).
This demonstrates that an intact PLE is required to inhibit ICP1 in
the absence of CRISPR targeting. These results, in conjunction with the
observation that PLE1 circularizes following ICP1 infection (Supplementary
Fig. 4), further support our designation of the 18-kb island
as a PICI-like element. 

It has been well documented in the type I-E (E. coli) system that 
CRISPR interference requires an intact PAM and a fully complemen- 
tary seed region (a non-contiguous 7 base pair (bp) sequence imme- 
diately adjacent to the PAM)”. To address the sequence requirements 
of the ICP1 CRISPR/Cas system we constructed a series of point muta- 
tions in the spacer 8 target in V. cholerae PLE that span the PAM, seed 
region and remainder of the target sequence, and determined their 
effect on immunity. In accordance with previous results, we found that 
single mutations within the PAM or the first four positions in the seed 
region immediately adjacent to the PAM abolish ICP1 CRISPR/Cas 
immunity (Supplementary Fig. 5). Interestingly, mutations of increas- 
ing distance from the PAM showed a concordant decreasing effect on 
immunity. Up to five mismatches outside of the seed region of the 
target are known to be tolerated in the type I-E system”, and similarly 
we found that three and five mutations outside of the seed region were 
tolerated; however, eight mutations were not (Supplementary Fig. 5). 
Figure 3 | Sequence-based targeting by the ICP1 CRISPR/Cas system is
essential for lytic growth on V. cholerae PLE+. a, Disruption of the V.
cholerae PLE target protospacer generating V. cholerae PLE(8*). The 32 bp
protospacer sequence is shaded in grey. b, The sensitivity of each strain (top
row) to ICP1 or ICP1(ΔS9) (left column) is shown. Identity between the spacer
and targeted protospacer is indicated by the red and blue rectangles. The
efficiency of plaquing (EOP, which is the plaque count on the mutant host
strain divided by that on the wild-type host strain) is indicated. A dagger
indicates that the EOP is 10⁻⁵ or 10⁻⁸ depending on the presence of PLE in the
host strain used for propagation as discussed in the text.
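
The efficiency of plaquing defined in this caption is a simple ratio, sketched below for illustration. The function name and the plaque counts are hypothetical and are not data from the study; the two counts are assumed to come from the same phage stock plated at the same dilution and volume.

```python
import math


def efficiency_of_plaquing(plaques_on_mutant_host, plaques_on_wild_type_host):
    """Return EOP: plaque count on the mutant host divided by the count on
    the wild-type host, for the same phage stock and plating conditions."""
    if plaques_on_wild_type_host <= 0:
        raise ValueError("wild-type host count must be positive")
    if plaques_on_mutant_host < 0:
        raise ValueError("plaque counts cannot be negative")
    return plaques_on_mutant_host / plaques_on_wild_type_host


if __name__ == "__main__":
    # Hypothetical counts: 20 plaques on the mutant host versus 4,000,000
    # (back-calculated from a dilution) on the wild-type host.
    eop = efficiency_of_plaquing(20, 4_000_000)
    print(f"EOP = {eop:.1e}, log10 = {math.log10(eop):.2f}")
```

An EOP near 1 indicates that the mutant host is as permissive as the wild type, whereas values several orders of magnitude below 1 indicate that only rare escape phages form plaques.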


In experiments where the ICP1 CRISPR/Cas system could not target 
the V. cholerae PLE and therefore plaque formation was greatly 
reduced, we observed phage escape mutants at frequencies that were 
dependent on the host strain on which the phage had been previously 
propagated. When ICP1(ΔS9) was grown on a PLE+ host before pla-
quing on V. cholerae PLE(8*), the efficiency of plaquing (EOP, which
is the plaque count on the mutant host strain divided by that on the
wild-type host strain) was 3.5 × 10⁻⁵. The CRISPR loci from ten inde-
pendent ICP1(ΔS9) escape mutants were sequenced, and in all cases, a
new spacer was present at the leader end of the CRISPR CR1 array.
Furthermore, the new spacers had 100% identity to sequences within 
the PLE (Fig. 2), and all newly integrated spacers target the PLE 
with the conserved GA dinucleotide PAM sequence (Supplementary 
Fig. 1b). The experimentally acquired spacers target both the coding 
and noncoding strands (Supplementary Table 3), although most (nine 
out of ten) target the coding strand. The pre-existing spacer (S8) 
(although mismatched in these experiments) also targets the coding 
strand; these data are in support of recent evidence that the DNA 
strand from which new protospacers are incorporated is heavily biased 
towards the existing protospacer orientation**~*. In contrast to when 
phage were propagated on a PLE+ host before plaquing on V. cholerae
PLE(8*), phage escape mutants were detected at a much lower fre-
quency (EOP = 1.1 × 10⁻⁸) when ICP1(ΔS9) was grown on a V. cho-
lerae PLE− host. This shows that new spacers targeting the PLE are
incorporated into the CRISPR array during ICP1(ΔS9) infection of the
PLE+ host (the immunization process), and that an immune host
possessing an untargeted PLE can subsequently be used to select for 
new ICP1 CRISPR acquisition events that confer targeting and thus 
restore phage replication. These results demonstrate that the ICP1 


Figure 2 | Genomic organization of PLE1, a representative V. cholerae PLE 
targeted by the CRISPR/Cas system of ICP1-related phages. The integrase 
(int) is in blue, genes encoding hypothetical proteins (with numerical ORF 
designations) are grey. The locations of protospacers incorporated into the 
CRISPR locus as spacers 8 and 9 (S8 and S9 of ICP1_2011_A) are indicated in
green above the map. The locations of experimentally acquired protospacers are 
shown below the map in red. 
CRISPR/Cas system is fully functional as an adaptive immune eva-
sion system that benefits the phage.

ICP1 has evolved to effectively target the V. cholerae PLE with an 
adaptive immune evasion system that has never before been shown to 
function in bacterial viruses. During ICP1 infection of V. cholerae 
PLE+, PLE circularizes (Supplementary Fig. 4) and inhibits ICP1
through an unknown mechanism. To replicate successfully, ICP1
uses the CRISPR/Cas system to target the PLE for destruction.
Because host cell death and DNA damage are inherent to lytic phage
infection, CRISPR-mediated DNA cleavage of the PLE does not affect 
ICP1 infection. Sequencing data has been used to identify putative 
CRISPR arrays within a Clostridium difficile prophage’, and more 
recently in metagenomic data sets of free viruses*®*’. However, there 
is currently no evidence for expression or function of these putative 
arrays. We show that the ICP1-encoded CRISPR/Cas system actively 
and autonomously functions to inhibit host immunity and thereby 
permit lytic infection. This finding, in conjunction with the previous 
observations regarding the presence of CRISPR loci in other phages*”’, 
suggests that the use of the so-called bacterial adaptive immune system 
by these bacterial predators may be an underappreciated immune 
evasion strategy in the unfolding phage versus host co-evolutionary 
arms race. 


METHODS 


Phages (ICP1_2011_A and ICP1_2006_E) and V. cholerae were isolated from 
cholera rice-water stool samples and propagated as described'*'*. Genomic 
libraries were generated for phage and host strains as described’** and sequenced 
using an Illumina HiSeq2000. A V. cholerae O1 El Tor isolate collected at the 
ICDDR,B in 2006, which was sequenced in this study and found to not harbour 
a PLE, was used as the PLE− host for propagation experiments. We used the
CRISPRFinder program’® to identify CRISPR loci. WebLogo” was used to gene- 
rate sequence logos for identification of the PAM. Point mutations were con- 
structed using splicing by overlap extension (SOE) PCR and introduced using 
pCVD442-lac as previously described’’. The PLE1 deletion construct (missing 
8.6 kb including ORFs 7-20) was constructed using SOE PCR and introduced 
by natural transformation with subsequent deletion of the antibiotic-resistance 
marker using the FLP recombinase method as described*’. ICP1(ΔS9) was iden-
tified by screening for alterations in the CRISPR array by PCR following growth on
V. cholerae PLE‘. RNA was purified using the mirVana kit (Ambion) at the 
indicated times and run on 12% polyacrylamide urea gels. Northern blots were 
pre-hybridized in Ultrahyb-oligo (Ambion) and hybridization was carried out at 
37 °C overnight using 32-nucleotide 5’ end-labelled DNA probes (generated with 
[γ-32P]ATP and T4 polynucleotide kinase) complementary to spacers 8 and 6.
Photoactivatable CRISPR-Cas9 for optogenetic
genome editing


Yuta Nihongaki, Fuun Kawano, Takahiro Nakajima & Moritoshi Sato 


We describe an engineered photoactivatable Cas9 (paCas9) 
that enables optogenetic control of CRISPR-Cas9 genome 
editing in human cells. paCas9 consists of split Cas9 
fragments and photoinducible dimerization domains named 
Magnets. In response to blue light irradiation, paCas9 
expressed in human embryonic kidney 293T cells induces 
targeted genome sequence modifications through both 
nonhomologous end joining and homology-directed repair 
pathways. Genome editing activity can be switched off simply 
by extinguishing the light. We also demonstrate activation 
of paCas9 in spatial patterns determined by the sites of 
irradiation. Optogenetic control of targeted genome editing 
should facilitate improved understanding of complex gene 
networks and could prove useful in biomedical applications. 


The type II bacterial clustered, regularly interspaced, short palindro- 
mic repeats (CRISPR) and the CRISPR-associated protein 9 (Cas9), 
known as CRISPR-Cas9, mediates targeted genome modifications 
that enable dissection of gene and regulatory functions!~*. The 
Streptococcus pyogenes Cas9 nuclease (hereafter referred to as Cas9) 
can bind to and cleave a target DNA sequence that is complementary 
to the first 20 nucleotides of a single-guide RNA (sgRNA) and is 
adjacent to a protospacer-adjacent motif (PAM) of the form NGG. 
Cas9-induced DNA double-strand breaks are repaired by nonho- 
mologous end joining (NHEJ) or homology-directed repair (HDR) 
in mammalian cells, thereby enabling targeted genome editing. 
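
The targeting rule stated above (a 20-nucleotide match to the guide, immediately followed by an NGG PAM) can be sketched as a simple site search. This is only an illustration: the function names are hypothetical, the guide is given as the DNA-equivalent protospacer sequence (the strand that reads the same as the guide, with the PAM on its 3' side), only exact matches are reported, and mismatch tolerance is not modelled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_sites(dna, guide):
    """Return (strand, left, pam) for exact guide matches followed by NGG.

    `left` is the leftmost forward-strand index of the 23-nt site
    (protospacer plus PAM), whichever strand carries the protospacer.
    """
    dna = dna.upper()
    guide = guide.upper().replace("U", "T")  # accept RNA or DNA spelling
    if len(guide) != 20:
        raise ValueError("guide must be 20 nucleotides long")
    site_len = len(guide) + 3
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        start = seq.find(guide)
        while start != -1:
            pam = seq[start + len(guide):start + site_len]
            if len(pam) == 3 and pam[1:] == "GG":
                left = start if strand == "+" else len(dna) - (start + site_len)
                sites.append((strand, left, pam))
            start = seq.find(guide, start + 1)
    return sites


if __name__ == "__main__":
    toy_guide = "GATTACAGATTACAGATTAC"  # toy 20-nt guide
    toy_dna = "CC" + toy_guide + "AGG" + "TT"
    print(find_cas9_sites(toy_dna, toy_guide))
```

Searching the reverse complement as well as the given strand matters because a protospacer and its PAM can lie on either strand of the target.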

Methods have been developed to chemically control the nuclease 
activity of Cas9, such as doxycycline-regulated Cas9 expression*°, 
rapamycin-inducible split-Cas9 (ref. 6) and transient delivery of 
purified Cas9:sgRNA complex”~°. These chemical methods have 
been used for generating conditional gene knockouts and reducing 
levels of off-target genome modification. However, some chemicals 
have adverse effects. For example, rapamycin can induce undesirable 
biological effects by perturbing the endogenous mammalian target of 
rapamycin (mTOR) pathway (ref. 10). Also, because chemicals diffuse freely
and are difficult to rapidly remove, such methods cannot be applied 
to achieve spatiotemporal genome editing. 

We set out to design a method of controlling Cas9 nuclease activity 
that is noninvasive and incorporates spatial, temporal and reversible 
control. Light has high spatiotemporal resolution and is noninvasive (refs. 11,12),
but methods to optically control Cas9 nuclease activity have so far 
been elusive. 


To produce an optically controlled Cas9 we fused two split Cas9 
fragments with photoinducible dimerization domains to generate 
paCas9 (Fig. 1a). In initial experiments to determine the best suited 
split site of Cas9 for high efficiency inducible control, we generated 
various Cas9 fragments fused with the rapamycin-inducible dimerization
system, FKBP-FRB (ref. 13) (Supplementary Fig. 1). We selected 18
candidate split sites based on an analysis of the crystal structure of Cas9
in complex with sgRNA (refs. 14,15). All candidate split site positions were
loop regions exposed to solvent. We assessed the rapamycin-induced 
nuclease activity of each split-Cas9 pair using a luciferase-reporter 
plasmid HDR assay (Fig. 1b). In this assay, the cytomegalovirus 
(CMV) promoter-driven luciferase reporter with an in-frame stop 
codon (StopFluc-1) is cleaved by split-Cas9, and then recovers full- 
length luciferase expression through homologous recombination with 
promoter-less luciferase donor vector. Eight combinations of N- and 
C-terminal Cas9 fragments showed significant rapamycin-induced 
reporter upregulation in HEK293T cells (Supplementary Fig. 2). In 
subsequent experiments, we used the N-terminal fragment of Cas9 
(residues 2-713, named N713) and C-terminal fragment of Cas9 
(residues 714-1,368, named C714), which was one of the most 
effective rapamycin-inducible split-Cas9 pairs. 

Next, we fused photoinducible dimerization domains with N713 
and C714 (Fig. 1c). First, we tested the CRY2-CIB1 photoinducible 
dimerization system, which is based on blue light-dependent protein 
interactions between Arabidopsis thaliana cryptochrome 2 (CRY2) 
and its binding partner CIB1 (ref. 16). This system has been widely 
used for optogenetic control of protein-protein interactions in mam- 
malian cells. We generated N713 and C714 fused to the photolyase 
homology region of CRY2 (CRY2PHR) and CIB1 and tested induc- 
tion potency using a luciferase plasmid HDR assay. However, N713 
and C714 fused to the CRY2PHR and CIB1 system did not show 
light-induced Cas9 activity. There are several possible explanations 
for this failure. First, steric hindrance could be caused by CRY2PHR 
(498 amino acids) and CIB1 (335 amino acids), impeding reassembly 
of split-Cas9. Second, oligomerization of CRY2PHR might reduce or 
preclude interactions between the N-terminal and C-terminal frag- 
ments of Cas9 (ref. 17). Keeping these potential problems in mind 
we next focused on a recently developed photoinducible dimerization
system named Magnets (ref. 18). The Magnet system consists of paired
photoswitchable proteins, named positive Magnet (pMag) and nega- 
tive Magnet (nMag). Upon blue light irradiation, pMag and nMag 
heterodimerize. Unlike CRY2-CIB1, pMag and nMag (150 amino 
Figure 1 Design and characterization of photoactivatable Cas9.
(a) Schematic of the photoactivatable Cas9 (paCas9). Cas9 is split
into two fragments without nuclease activity, and the Cas9 fragments
are fused with photoinducible dimerization domains (pMag and nMag).
Blue light irradiation induces heterodimerization between pMag and nMag,
which enables split Cas9 fragments to reassociate, thereby reconstituting
RNA-guided nuclease activity. Gray, inactive; blue, active. (b) Luciferase reporter plasmid HDR assay. When Cas9 cleaves the CMV-driven luciferase
reporter with an in-frame stop codon (StopFluc-1), the luciferase reporter is repaired by homologous recombination with promoter-less luciferase donor
vector and recovers bioluminescence activity. (c) Light-induced reporter activity in HEK293T cells using N713 and C714 fragments of Cas9 fused with
photoinducible dimerization domains. (d,e) Activities of paCas9-1 and full-length Cas9 targeting StopFluc-1 (d) and StopFluc-2 (e), harboring indicated
mutations in the PAM. Values are normalized to a positive control, which is a luciferase reporter with canonical PAM (NGG). (f-h) Activities of Cas9
and paCas9 targeting StopFluc-1 (f), StopFluc-2 (g) and StopFluc-3 (h) with a set of sgRNAs harboring single-nucleotide Watson-Crick transversion
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acids each) are of a similar size to FKBP (107 amino acids) and FRB 
(93 amino acids). The dynamic range and dissociation kinetics of the 
Magnet system can be tuned by mutating pMag and/or nMag. We used 
pMag and nMagHigh1 (nMag with M135I and M165I mutations)!¥. 
To test whether Magnets could provide effective light-triggered 
reassembly of a split-Cas9, we tested two pairs of fusion proteins; 
N713-nMagHigh1 and pMag-C714 and N713-pMag and nMagHigh1- 
C714. We found that both paired fusion proteins showed substantial 
light-induced Cas9 activity, and N713-pMag and nMagHigh1-C714 
yielded the highest fold-induction (16.4-fold) and lowest background 
activity (Fig. 1c). We used N713-pMag and nMagHigh1-C714 and 
named this construct paCas9-1 (Supplementary Fig. 3). 

To investigate whether paCas9-1 recognizes PAM in the same way 
as full-length Cas9, we generated a set of luciferase reporters, into 
which we inserted stop codons, and which harbored a point mutation 
in NGG PAM (Fig. 1d,e). We tested two different luciferase reporters
containing an internal stop codon in different sites (StopFluc-1 and
StopFluc-2) and confirmed that the Cas9-induced activities of the 
luciferase reporters that had noncanonical PAM were lower than 
that of luciferase reporters having the canonical PAM of the form 
NGG. We found no significant difference in the normalized luci- 
ferase activities of paCas9-1 and full-length Cas9. We also evaluated 
the DNA targeting specificity of paCas9-1 using a luciferase plasmid 
HDR assay (Fig. 1f). We produced a series of sgRNAs for StopFluc-1 
harboring single-nucleotide Watson-Crick transversion mutations 
(Supplementary Table 1). We found no significant difference in the 
DNA targeting specificities of paCas9-1 and full-length Cas9. To further 
investigate DNA targeting specificity of paCas9-1, we performed the 
specificity assay using two different reporters containing an internal 
stop codon in different sites (StopFluc-2 and StopFluc-3) and con- 
firmed that the DNA targeting specificity of paCas9-1 is comparable 
with that of full-length Cas9 (Fig. 1g,h). Consistent with previous
studies, sensitivity patterns to single sgRNA-DNA mismatches differed
among target sequences. From these experiments, we conclude
that the PAM requirement and target specificities of paCas9-1 are
indistinguishable from those of full-length Cas9.

Figure 2 Optogenetic genome editing of mammalian endogenous genes by the photoactivatable
(d) Light-induced multiplex genome editing by paCas9. HEK293T cells were transfected with paCas9-1 and
the indicated sgRNAs. (e) Schematic of paCas9-mediated precise genome editing experiments. The red
arrowhead indicates the putative paCas9 cleavage site. A 96-mer, single-stranded oligodeoxynucleotide (ssODN)

To show that paCas9-1 could cleave a targeted endogenous genomic 
locus in mammalian cells and induce indel mutations by nonhomologous
end joining (NHEJ) in a light-dependent fashion, we transfected
HEK293T cells with paCas9-1 and an sgRNA targeting the
human CCR5 locus (Fig. 2a). We quantitatively evaluated the ability
of paCas9-1 to induce indel mutations in response to light using the 
mismatch-sensitive T7 endonuclease I (T7E1), which cleaves hetero- 
duplexed DNA strands formed by hybridization between mutant 
and wild-type DNA. In the dark, cells transfected with paCas9-1 
targeting CCR5 showed only 1.1% indel rates. However, upon blue 
light irradiation, cells transfected with paCas9-1 targeting CCR5 
exhibited significantly (P = 0.012) higher indel rates (20.5%) in the 
human CCR5 locus. The frequency of indel mutations induced by
paCas9-1 is ~60% of that achievable with full-length Cas9 (paCas9- 
1: 20.5%, full-length Cas9: 34.4%) (Fig. 2a). Using Sanger sequenc- 
ing, we verified that paCas9-1-mediated indel mutations occurred 
in the targeted region of the human CCR5 locus (Fig. 2b). To explore
the generalizability of optogenetic RNA-guided genome editing with 
paCas9-1, we used sgRNAs for four additional sites in three human 
genes (EMX1, VEGFA and AAVS1). With each of these sgRNAs, we
observed light-induced indel mutations (Fig. 2c). We
also carried out a time-course analysis of indel mutations in the EMX1 
locus induced by paCas9-1 (Supplementary Fig. 4). We observed 
that the frequency of indel mutations increased as the blue light 
irradiation time increased. To test whether paCas9-1 could induce
indel mutations in different cell lines, we transfected HeLa cells with
paCas9-1 and an sgRNA targeting human EMX1 (Supplementary 
Fig. 5). Light-induced indel mutations in the EMX1 locus were also 
observed in HeLa cells. We also tested whether paCas9-1 could induce 
indel mutations at multiple target sites (Fig. 2d). Using two sgRNAs 
targeting EMX1 and VEGFA simultaneously, paCas9-1 induced indel 
mutations in both human EMX1 and VEGFA loci in response to light.
These results show that paCas9-1 can be deployed for optogenetic 
multiplexed control of NHEJ-mediated indel mutations in mamma- 
lian cells. 

Next, we investigated whether paCas9-1 could be used for genome 
editing by means of HDR (Fig. 2e,f). We used single-stranded oli- 
godeoxynucleotides (ssODN) as a donor template. We transfected 
HEK293T cells with paCas9-1 targeting EMX1 and ssODN containing 
a HindIII site and analyzed the frequency of HDR in the EMX1 locus 
using a restriction fragment length polymorphism (RFLP) assay. We 
found that paCas9-1 induces HindIII site integration in the human 
EMX1 locus at a frequency of 7.2%. This shows that paCas9-1 can 
induce both random indel mutations and designed genome sequence 
modification through HDR in response to light. 

The off-target activity of Cas9 has been reduced using a paired nick- 
ing strategy based on the Cas9 D10A variant, which nicks targeted 
DNA instead of cutting double strands. To explore whether
paCas9-1 could be converted into a photoactivatable nickase, we 
produced paCas9-1 containing the D10A mutation (paCas9 nickase) 
(Supplementary Fig. 6). Using the paCas9 nickase with an sgRNA 
targeting EMX1 did not induce indel mutations in the human EMX1 
locus. However, using a pair of sgRNAs targeting opposite strands 
of the EMX1 site, our paCas9 nickase induced indel mutations in
response to light. This indicates that the double-nicking strategy can
be applied to paCas9 to reduce off-target genome modifications.

Although paCas9-1 can induce efficient NHEJ-mediated indel
mutation, paCas9-1 had background activity (about 1-3%) in the dark
(Fig. 2a,c). To reduce the background activity of paCas9-1, we replaced
nMagHigh1-C714 with nMag-C714 because the combination of pMag
and nMag shows lower background activity than that of pMag and
nMagHigh1 (ref. 18). We transfected HEK293T cells with N713-pMag,
nMag-C714 (named paCas9-2) and an sgRNA targeting the VEGFA
locus, and measured induced indel mutations in the light and the dark
(Fig. 3a). The frequency of indel mutations in the dark induced by
paCas9-2 was reduced to an undetectable level, as monitored using
the T7E1 assay. Note that the light-induced indel frequency with
paCas9-2 is comparable to that obtained using paCas9-1 (Fig. 3b).
Therefore, we used paCas9-2 in all subsequent experiments.

Next, we tested whether paCas9 can enable light-induced spatial
activation of genome editing (Fig. 3c,d and Supplementary Fig. 7).
To visualize Cas9-induced NHEJ-mediated indel mutations in living
cells, we used a surrogate EGFP reporter system that expresses EGFP
fluorescence when a double-strand break is introduced into the target
sequence by Cas9 (refs. 23,24). HEK293T cells transfected with
paCas9-2, the surrogate EGFP reporter and an sgRNA targeting the reporter
were irradiated with slit-patterned blue light. After 24 h, a slit pattern
of EGFP expression was observed, showing that paCas9-2 can
spatially control gene editing in response to light.

We also investigated whether paCas9 activation is reversible
(Fig. 3e,f). We first transfected HEK293T cells with paCas9-2 and
an sgRNA targeting VEGFA, and incubated these samples with blue-light
irradiation to induce paCas9-2 activation. After 6 h, we split
the cells and incubated half the cells in the dark and half the cells
in light. After 6 h incubation in the dark or light, we performed a
second transfection of an sgRNA targeting EMX1. The cells that had
been incubated in the dark just before the second transfection were
incubated again in the dark state, whereas the cells incubated in light
before the second transfection were incubated again in light. After 30 h
incubation, the genomic DNA was isolated and analyzed by T7E1
assay. Cells irradiated with blue light after the first transfection of
paCas9-2 and with sgRNA targeting VEGFA showed indel mutations
in the VEGFA locus, indicating that paCas9-2 was activated by the
blue light. After the second transfection with an sgRNA targeting
EMX1, cells irradiated continuously with blue light showed indel
mutations in the EMX1 locus; however, cells shifted to the dark did
not have indel mutations in the EMX1 locus. This result indicates that
paCas9-2 can be reversibly activated by blue light.

Figure 3 Spatiotemporal control of Cas9 nuclease activity with optimized
paCas9-2. (a) Comparison of paCas9-1 and paCas9-2 by T7E1 assay. paCas9-2
dramatically reduces background indel mutation in the human VEGFA locus
while maintaining light-induction potency. (b) Quantification of the indel
frequencies from the relative band intensities of a. Data are represented as
means ± s.d. (n = 3 from individual experiments). Two-sided Welch’s t-test
was performed. N.S., not significant. (c) Spatial activation of paCas9.
HEK293T cells were transfected with paCas9-2, NHEJ-dependent,
EGFP-expressing surrogate reporter and sgRNA targeting the surrogate
reporter. 20 h after transfection, samples were irradiated by slit-patterned
blue light using a photomask for 24 h. The width of the slit is 2 mm.
Close-up view within white square region is also shown. Scale bar, 3 mm
(zoom-out view) and 1 mm (close-up view), respectively. (d) Line scan
intensity profile of EGFP (green) and mCherry (red) in c. (e) Experimental
scheme to test whether paCas9 activation is reversible. First, HEK293T cells
were transfected with paCas9-2 and sgRNA targeting VEGFA. After 20 h, cells
were illuminated with blue light for 6 h, then split and incubated in the
light or the dark. After 6 h incubation, the cells were transfected with an
sgRNA targeting EMX1, and incubated again in the light or the dark. 30 h
later, genomic DNA was extracted. If paCas9 activation is reversible, the
cells shifted to the dark state before the second transfection of sgRNA
targeting EMX1 should show the indel mutation in only the VEGFA locus.
(f) A representative gel of the T7E1 assay in e. The indel frequencies of
the EMX1 and VEGFA locus are shown at the bottom of the gel (n = 4 from
two independent experiments with biological duplicates). Full-length gels
are presented in Supplementary Figure 8.
[Panel f lane annotations as extracted: T7E1 target: VEGFA, EMX1, VEGFA,
EMX1; Indel (%): 25.0, 7.5, 19.3, N.D.]

© 2015 Nature America, Inc. All rights reserved.

Several reports have shown that RNA-guided targeting of catalytically
inactive Cas9 (dCas9) to a specific gene can sterically block RNA
polymerase and transcript elongation, enabling RNA-guided gene
silencing, which has been named CRISPR interference (CRISPRi).
To further demonstrate the utility of paCas9, we tested photoactivatable
and reversible control of CRISPRi. We named this method
photoactivatable CRISPRi (paCRISPRi). To do this, we generated
paCas9-2 containing D10A and H840A mutations (padCas9)
(Fig. 4a). We also designed three sgRNAs targeting different regions
of the CMV promoter-driven luciferase reporter containing PEST
(proline-glutamate-serine-threonine rich) and mRNA-destabilizing
sequences. padCas9 with each sgRNA targeting the luciferase
reporter showed light-induced repression of luciferase reporter activity
(Fig. 4b), demonstrating that our paCas9 platform also provides
optogenetic control of RNA-guided transcription. We also found
that the repression efficiency of padCas9 is slightly lower than that
of full-length dCas9 (Fig. 4c). This is consistent with the fact that
the frequency of indel mutation induced by paCas9 is ~60% of
that achievable with full-length Cas9 (Fig. 2a). Efficiency might
be improved by further engineering of padCas9 to, for instance,
optimize the protein stability, nuclear localization and dynamic
range of photoinducible domains. We also tested whether
padCas9-mediated gene repression could be switched off by halting light
irradiation (Fig. 4d). We found that the reporter activity recovered
gradually after we turned off the blue light, showing that CRISPRi
with padCas9 is reversible. Collectively, our results show that paCas9
can offer spatiotemporal control of RNA-guided genome editing and
transcription regulation in a reversible manner.

Figure 4 Optogenetic control of RNA-guided transcription interference
with padCas9. (a) Schematic of paCRISPRi with paCas9-2 harboring
D10A and H840A mutations (padCas9). In the dark, N713 (D10A)-pMag
and nMag-C714 (H840A) are inactive. Upon blue light illumination,
pMag and nMag heterodimerize and, consequently, N713 (D10A)
and C714 (H840A) are reconstituted as functional dCas9, enabling
sgRNA-guided transcription interference. Gray, inactive; green, active.
(b) padCas9 can repress gene expression in a light-dependent manner.
HEK293T cells were transfected with N713 (D10A)-pMag, nMag-C714 (H840A),
luciferase reporter and either the indicated sgRNAs targeting luciferase
(sgFluc-1, sgFluc-2 and sgFluc-3) or an unmatched control sgRNA (sgNeg.).
20 h after transfection, samples were illuminated by 1.2 W/m² blue light or
kept in the dark for 30 h before measuring luciferase bioluminescence. In
this experiment, a luciferase reporter containing PEST and
mRNA-destabilizing sequences was used. Data are represented as the
mean ± s.e.m. (n = 6 from two individual experiments with biological
triplicates). Two-sided Welch’s t-test was carried out. N.S., not
significant; ***P < 0.005 versus the sample in the dark. (c) Comparison of
the repression efficiency of padCas9 and full-length dCas9. Values are
normalized to negative control with an unmatched control sgRNA (sgNeg.).
Data are represented as the mean ± s.e.m. (n = 6 from two individual
experiments with biological triplicates). Two-sided Welch’s t-test was
performed. N.S., not significant; *P < 0.05; ***P < 0.005 versus the
full-length dCas9 in the dark. (d) Time-course of restored luciferase
reporter activity after blue light irradiation. HEK293T cells were
transfected with N713 (D10A)-pMag, nMag-C714 (H840A), luciferase reporter
and the indicated sgRNAs. Immediately after transfection, samples were
illuminated by 1.2 W/m² blue light for 30 h before measuring
bioluminescence at time 0. After the time 0 measurement, samples were
illuminated by 1.2 W/m² blue light (solid lines) or kept in the dark
(dotted lines), and bioluminescence was measured every 6 h. Data are shown
as mean ± s.e.m. (n = 6 from two individual experiments with biological
triplicates) normalized to negative control cells (under continuous light
irradiation) at time 0.

A rapamycin-inducible Cas9 designed with a split-Cas9 architecture
that is similar to ours has recently been reported. However,
rapamycin-inducible Cas9 does not offer spatial and reversible
activation because rapamycin diffuses freely and is hard to
remove. Furthermore, rapamycin can perturb endogenous mTOR
signaling pathways.

The spatiotemporal and reversible properties of paCas9 are well 
suited for the dissection of causal gene function in diverse biological 
processes and for medical applications, such as in vivo and ex vivo 
gene therapies. Also, paCas9 has the potential to reduce off-target 
indel frequencies in Cas9-based genome editing. There are several 
studies showing that transient introduction of a Cas9:sgRNA complex 
prepared in vitro can improve the specificity of genome editing.
Because paCas9 can be switched off by stopping light irradiation, opti- 
cally controlling the duration of paCas9 activation would contribute 
to reducing off-target gene modification. 

This paCas9 platform has the potential to further facilitate CRISPR- 
Cas9 applications. For example, Cas9 applications in vivo have been
limited owing to the packaging size limitation of viral vectors. Because
the cDNA of our paCas9 components is shorter than that of full-length 
Cas9, it will be possible to separately package each paCas9 fragment 
into viral vectors having size limitation, expanding the opportunities 
of in vivo genome editing. Another potential application of paCas9 is 
intersectional control of Cas9 activity, enabling conditional genome 
editing with high precision. Recently, a conditional gene knockout
strategy has been developed by combining Cas9 and a tissue-specific
promoter. By expressing each paCas9 component from two different
tissue-specific promoters, Cas9 activity could be controlled by
two promoter activities and light, enabling gene modification with 
ultrahigh precision. 

We and other groups have recently reported dCas9-based photoac- 
tivatable transcription systems. These dCas9-based optogenetic 
systems can activate targeted endogenous gene expression. Unlike 
these systems, the paCas9 described in this manuscript enables light- 
inducible targeted genome editing by NHEJ and HDR pathways. In 
addition, we show that paCRISPRi enables optogenetic control of 
targeted gene silencing. 


METHODS 
Methods and any associated references are available in the online 
version of the paper. 


Note: Any Supplementary Information and Source Data files are available in the
online version of the paper.

ONLINE METHODS

Inducible Cas9 constructions. cDNAs encoding the N- and C-terminal
fragments of codon-optimized S. pyogenes Cas9 fused with a nuclear
localization signal from SV40 were amplified from Addgene plasmid 42230.
cDNAs encoding FKBP and FRB were amplified from a human cDNA library. cDNA
encoding CRY2PHR was amplified from Addgene plasmid 26871. The plasmid
containing CIB1 was obtained from RIKEN BioResource Center (resource
number: pdal0875). cDNAs encoding pMag, nMagHigh1 and nMag were
prepared as previously described (ref. 18). These inducible dimerization
domains were amplified by standard PCR using primers that add glycine-serine
linker sequences at the 5’ and 3’ ends. The inducible Cas9 constructs based
on the N-terminal and C-terminal Cas9 fragments fused with dimerization
domains were cloned into HindIII/EcoRI and HindIII/XhoI sites of
pcDNA3.1/V5-HisA (Invitrogen), respectively. To construct paCas9 nickase and
padCas9, we introduced the D10A mutation in the N-fragment of Cas9 and
H840A in the C-fragment of Cas9 using a multisite-directed mutagenesis kit
(MBL) according to the manufacturer's directions. The full amino acid
sequences of paCas9-1 and paCas9-2 are
shown in Supplementary Note 1.


sgRNA constructions. The sgRNAs targeting StopFluc reporters, CCR5,
EMX1, VEGFA, AAVS1 and the destabilized luciferase reporter were generated by
annealed oligo cloning using the BbsI site of Addgene plasmid 47108. The target
sequences and oligonucleotides used for sgRNA construction are shown in 
Supplementary Table 1. 


Reporter constructions. StopFluc reporters for the plasmid HDR assay were
constructed by inserting the firefly luciferase sequence amplified from the
pGL4.31 vector (Promega) into HindIII and XhoI sites in pcDNA3.1/V5-HisA and
introducing stop codons and/or mutant PAM using the multisite-directed
mutagenesis kit. The site-directed mutagenesis primers used to generate a
series of StopFluc reporters are shown in Supplementary Table 2. The DNA
sequences of StopFluc reporters are also shown in Supplementary Note 2. The
luciferase donor vector was constructed by inserting an inverted firefly
luciferase sequence into XhoI and HindIII sites in the bacteria-expression
pColdI vector (Clontech). The destabilized luciferase reporter was
constructed by inserting firefly luciferase with PEST sequence amplified from
the pGL4.31 vector into KpnI and XbaI sites in pcDNA3.1/V5-HisA, and
introducing 5 copies of an mRNA-destabilizing nonamer sequence
(5’-TTATTTATT-3’) into XbaI and ApaI sites by annealed oligo cloning. The
surrogate EGFP reporter was constructed by inserting mCherry and
out-of-frame EGFP into HindIII and XhoI sites in pcDNA3.1/V5-HisA,
and introducing the EMX1 target site between mCherry and EGFP using EcoRI
and BamHI sites by annealed oligo cloning.


Cell culture. HEK293T and HeLa cells (ATCC) were cultured at 37 °C under
5% CO2 in Dulbecco’s Modified Eagle Medium (DMEM, Sigma-Aldrich)
supplemented with 10% FBS (HyClone), 100 units/ml penicillin and
100 µg/ml streptomycin (GIBCO). Cell lines have not been tested routinely for
Mycoplasma contamination.


Luciferase plasmid HDR assay. HEK293T cells were plated at approximately
2.0 × 10^4 cells/well in a 96-well black-walled plate (Thermo Fisher Scientific),
and cultured for 24 h at 37 °C in 5% CO2. The cells were then transfected with
Lipofectamine 2000 (Invitrogen) according to the manufacturer’s protocols.
Plasmids encoding the N-fragment of Cas9 fused with a dimerization domain,
the C-fragment of Cas9 fused with a dimerization domain, sgRNAs, StopFluc
reporter and luciferase donor were transfected at a 2.5:2.5:5:1:4 ratio. The
total amount of DNA was 0.2 µg/well. Twenty hours after the transfection,
samples were incubated at 37 °C in 5% CO2 under continuous blue light
irradiation or in the dark. Blue light irradiation was performed using a
470 nm ± 20 nm LED light source (CCS Inc.). Intensity of blue light was
1.2 W/m². For chemically inducible reassembly of split-Cas9, the culture
medium was replaced with 100 µl of DMEM containing 10 nM rapamycin instead
of light irradiation. After 48 h incubation, the culture medium was replaced
with 100 µl of phenol red-free DMEM (Sigma-Aldrich) containing 500 µM
D-luciferin (Wako Pure Chemical Industries) as a substrate. After 30 min
incubation, bioluminescence measurements were performed using a
Centro XS³ LB 960 plate-reading luminometer (Berthold Technologies). For
comparison between the DNA recognition ability of paCas9 and full-length
Cas9, plasmids encoding full-length Cas9, sgRNAs, StopFluc reporter and
luciferase donor were transfected at a 5:5:1:4 ratio. The total amount of
DNA was 0.2 µg/well. After 48 h incubation, the culture medium was replaced
with D-luciferin-containing phenol red-free DMEM
and bioluminescence measurements were performed as described above.
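The transfection ratios above can be turned into per-plasmid amounts with simple proportional arithmetic. The following is a minimal sketch, assuming the stated ratios are mass ratios (the text does not say whether they are mass or molar); the function name `split_dna_mass` and the dictionary labels are hypothetical and chosen only for illustration.

```python
def split_dna_mass(total_ug, ratio):
    """Return the DNA mass (in micrograms) of each plasmid.

    Assumption: `ratio` holds mass ratios, so each plasmid receives
    total_ug * part / sum(parts).
    """
    ratio_sum = sum(ratio.values())
    if ratio_sum <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return {name: total_ug * part / ratio_sum for name, part in ratio.items()}


# Split-Cas9 mix: 2.5:2.5:5:1:4 at 0.2 micrograms total DNA per well.
split_cas9_mix = split_dna_mass(0.2, {
    "N-fragment fusion": 2.5,
    "C-fragment fusion": 2.5,
    "sgRNA": 5,
    "StopFluc reporter": 1,
    "luciferase donor": 4,
})

# Full-length Cas9 control: 5:5:1:4 at the same total amount.
full_length_mix = split_dna_mass(0.2, {
    "full-length Cas9": 5,
    "sgRNA": 5,
    "StopFluc reporter": 1,
    "luciferase donor": 4,
})

for name, mass_ug in split_cas9_mix.items():
    print(f"{name}: {mass_ug * 1000:.1f} ng")
```

Under this assumption both mixes sum to 15 parts, so the reporter, donor and sgRNA amounts are identical in the two conditions and the two Cas9 fragments together replace the full-length Cas9 plasmid.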


Optogenetic genome editing experiments. For NHEJ-mediated indel mutation
experiments, HEK293T cells were plated at approximately 1.0 × 10^5
cells/well in a 24-well plate (Thermo Fisher Scientific), and cultured for
24 h at 37 °C in 5% CO2. The cells were then transfected with Lipofectamine
2000 according to the manufacturer’s protocols. Plasmids encoding
N713-pMag, nMagHigh1-C714 and sgRNAs were transfected at a 1:1:1 ratio. As a
positive control, plasmids encoding full-length Cas9 and sgRNAs were
transfected at a 2:1 ratio. The total amount of DNA was 0.9 µg/well. Twenty
hours after the transfection, samples were incubated at 37 °C in 5% CO2
under continuous blue light irradiation or in the dark as described above.
After 24 h incubation, genomic
DNA was isolated using Blood Cultured Cell Genomic DNA Extraction Mini
Kit (Favorgen) according to the manufacturer’s instructions.

For HDR-mediated genome editing experiments, 6.0 × 10^5 HEK293T cells
were nucleofected with 125 ng of N713-pMag, 125 ng of nMagHigh1-C714,
250 ng of sgRNA targeting EMX1 and 10 µM of single-stranded
oligonucleotide donor using the SF Cell line 4D-Nucleofector X Kit S (Lonza)
and the CA-189 program. Transfected cells were plated at 2.0 × 10^5
cells/well in a 24-well plate. Twenty hours after the nucleofection, samples
were incubated at 37 °C in 5% CO2 under continuous blue light irradiation or
in the dark. After 48 h
incubation, genomic DNA was isolated as described above.

In the Figure 3f experiment, cells were plated and cultured as described
above for the NHEJ-mediated indel mutation experiments. The cells were then
transfected with Lipofectamine 3000 (Invitrogen) according to the
manufacturer's protocols. Plasmids encoding N713-pMag, nMagHigh1-C714 and
sgRNAs were transfected at a 1:1:1 ratio. The total amount of DNA was
0.5 µg/well. After 20 h incubation at 37 °C in 5% CO2 in the dark, samples
were incubated at 37 °C in 5% CO2 under continuous blue light irradiation.
After 6 h, we split the cells and incubated them in the dark or light state.
After incubation for 6 h, we performed a second transfection of sgRNA
targeting EMX1 with Lipofectamine 3000. The DNA amount was 0.5 µg/well. The
samples in the dark and light states just before the second transfection
were incubated again in the dark and light states, respectively. After
incubation for 30 h, the genomic DNA was isolated as described above.


Mismatch-sensitive T7E1 assay for quantifying indel mutation of endogenous
genes. The genomic region containing the paCas9 target site was
PCR-amplified using Pyrobest DNA polymerase (TaKaRa) using nested PCR for
CCR5 and AAVS1 (First PCR: 98 °C, 3 min; (98 °C, 10 s; 55 °C, 30 s; 72 °C,
1 min) × 20 cycles; 72 °C, 3 min. Second PCR: 98 °C, 3 min; (98 °C, 10 s; 55 °C,
30 s; 72 °C, 1 min) × 35 cycles; 72 °C, 3 min), two-step PCR with 5% DMSO
for EMX1 (98 °C, 3 min; (98 °C, 10 s; 72 °C, 30 s) × 35 cycles; 72 °C, 5 min) or
touchdown PCR for VEGFA (98 °C, 3 min; (98 °C, 10 s; 72-62 °C, -1 °C/cycle,
30 s; 72 °C, 30 s) × 10 cycles; (98 °C, 10 s; 62 °C, 30 s; 72 °C, 30 s) × 25 cycles;
72 °C, 3 min). The primers for each gene are listed in Supplementary Table 3.
The PCR amplicons were purified using FastGene Gel/PCR Extraction Kits
(Nippon Genetics) following the manufacturer’s protocol. Purified PCR
products were mixed with 2 µl of 10× M buffer for restriction enzyme (TaKaRa)
and ultrapure water to a final volume of 20 µl, and re-annealed to form
heteroduplex DNA (95 °C, 10 min; 90-15 °C, -1 °C/min). After re-annealing,
heteroduplexed DNA was treated with 5 units of T7 endonuclease I (New
England BioLabs) for 30 min at 37 °C and then analyzed by agarose gel
electrophoresis. Gels were stained with GRR-500 (BIO CRAFT) and imaged with
the E-shot II gel imaging system (ATTO). Quantification was based on relative
band intensities. The following equation was used to calculate the percentage
of indel mutation by paCas9: 100 × (1 − (1 − (b + c)/(a + b + c))^{1/2}),
where a is the intensity of the undigested PCR product, and b and c are the
intensities of each T7E1-digested PCR product.
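The indel equation above translates directly into code. This is a minimal sketch of that formula only; the function name `t7e1_indel_percent` and the example band intensities are hypothetical and are not taken from the paper's data.

```python
import math


def t7e1_indel_percent(a, b, c):
    """Estimate indel frequency (%) from T7E1 gel band intensities.

    a: intensity of the undigested PCR product
    b, c: intensities of the two T7E1-digested products
    Implements 100 * (1 - sqrt(1 - (b + c) / (a + b + c))).
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    cleaved_fraction = (b + c) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - cleaved_fraction))


# Hypothetical intensities in arbitrary units: 36% of the signal is cleaved.
example = t7e1_indel_percent(64.0, 18.0, 18.0)
print(f"indel frequency: {example:.1f}%")
```

The square root reflects that a heteroduplex forms whenever at least one of the two re-annealed strands carries a mutation, so the cleaved fraction overstates the mutant allele fraction; with 36% of the signal cleaved, the estimate is about 20%.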


Sequence analysis. Purified PCR products used for the T7E1 assay were 
inserted into EcoRV sites in pcDNA3.1/V5-HisA vector. Plasmid DNAs were 
isolated by standard alkaline lysis miniprep, and sequenced using a T7 forward 
primer by the Sanger method.

RFLP assay for detecting HDR-mediated modification in endogenous
human gene. The genomic PCR and purification were performed as described
above. Purified PCR products were mixed with 30 units of HindIII (TaKaRa),
2 µl of 10× M buffer for restriction enzyme and ultrapure water to a final
volume of 20 µl, and incubated for 30 min at 37 °C. The digested products
were analyzed by agarose gel electrophoresis. Gel staining and imaging were
performed as described above. Quantification was based on relative band
intensities. The following equation was used to calculate the percentage of
HDR by paCas9: 100 × (b + c)/(a + b + c), where a is the intensity of the
undigested PCR product, and b and c are the intensities of each
HindIII-digested product.
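The HDR equation above is a plain cleaved-fraction calculation, shown here as a minimal sketch. The function name `rflp_hdr_percent` and the example intensities are hypothetical; only the formula comes from the text.

```python
def rflp_hdr_percent(a, b, c):
    """Estimate HDR frequency (%) from RFLP gel band intensities.

    a: intensity of the undigested PCR product
    b, c: intensities of the two HindIII-digested products
    Implements 100 * (b + c) / (a + b + c).
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * (b + c) / total


# Hypothetical intensities in arbitrary units: 7.2% of the signal is digested.
example = rflp_hdr_percent(92.8, 3.6, 3.6)
print(f"HDR frequency: {example:.1f}%")
```

Unlike the T7E1 calculation, no square-root correction is applied here, because HindIII cuts only amplicons that carry the integrated site, so the digested fraction is read directly as the HDR frequency.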


Spatial surrogate reporter activation. HEK293T cells were plated at
8.0 × 10^5 cells/dish on a 35 mm dish (Iwaki Glass) coated with fibronectin (BD
Biosciences), and cultured for 24 h at 37 °C in 5% CO2. The cells were then
transfected with Lipofectamine 2000 according to the manufacturer’s
protocols. Plasmids encoding N713-pMag, nMag-Cas9, sgRNAs targeting EMX1
and NHEJ-mediated surrogate EGFP reporter containing the EMX1 targeting site
were transfected at a 1:1:2:6 ratio. The total amount of DNA was 4.0 µg/dish.
Twenty hours after the transfection, samples were illuminated by slit-patterned
blue light using a photomask for 24 h at 37 °C in 5% CO2. The width of the slit is
2 mm. Cells were fixed with 4% paraformaldehyde in PBS for 15 min. Images
were acquired using an Axio Zoom.V16 stereo zoom microscope (Zeiss), and
analyzed using Metamorph (Molecular Devices).


Photoactivatable CRISPR interference. HEK293T cells were plated at
2.0 × 10^4 cells/well in a 96-well black-walled plate, and cultured for 24 h
at 37 °C in 5% CO2. The cells were then transfected with Lipofectamine 3000
according to the manufacturer's protocols. Plasmids encoding N713
(D10A)-pMag, nMag-Cas9 (H840A), mRNA-destabilized luciferase-PEST reporter
and indicated sgRNAs targeting luciferase reporter were transfected at a
2.5:2.5:1:4 ratio. As a full-length dCas9 control, plasmids encoding
full-length dCas9, mRNA-destabilized luciferase-PEST reporter and indicated
sgRNAs were transfected at a 5:1:4 ratio. For the sample transfected with
triple sgRNAs, the ratio of the three sgRNAs was 1:1:1. The total amount of
DNA was 0.1 µg/well. In the Figure 4b,c experiments, 20 h after the
transfection, samples were incubated at 37 °C in 5% CO2 under continuous
blue light irradiation or in the dark as described above. After 30 h, the
culture medium was replaced with 100 µl of phenol red-free DMEM containing
500 µM D-luciferin. After 1 h incubation, bioluminescence measurements
were performed. In the Figure 4d experiment, samples were incubated at 37 °C
in 5% CO2 under continuous blue light immediately after transfection. After
30 h, the culture medium was replaced with phenol red-free DMEM containing
D-luciferin as described above. After 1 h incubation, light-illuminated samples
were incubated again under continuous blue light or in the dark, and
bioluminescence measurements were performed at the indicated time points.


Reproducibility and statistics. No sample size estimates were performed, 
and our sample sizes are consistent with those normally used in genome
editing and gene regulation experiments. No sample exclusion was carried out.
No randomization was used. No blinding was used. In Figures 3b and 4b,c, 
variances estimated by F-tests were equal, and two-sided Welch's t-tests were 
performed. We also confirmed the same statistical results by nonparametric 
two-sided Mann-Whitney tests because the normality of data was not tested. 
P values from the Mann-Whitney tests are as follows. In Figure 3b, paCas9-1 and
paCas9-2 in light showed no significant difference: P = 0.773. In Figure 4b,
light and dark samples transfected with padCas9 targeting luciferase showed
significant differences: P = 0.00395. Light and dark samples transfected with
padCas9 and negative control sgRNA showed no significant difference:
P = 0.749. In Figure 4c, padCas9 and full-length dCas9 transfected with
sgFluc-1, sgFluc-2 and sgFluc-1+2+3 showed significant differences: P = 0.00395
(sgFluc-1), P = 0.0065 (sgFluc-2) and P = 0.00395 (sgFluc-1+2+3), respectively.
padCas9 and full-length dCas9 transfected with sgFluc-3 showed no significant
difference: P = 0.109.
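For small samples, a two-sided Mann-Whitney P value can be computed exactly by enumerating every way of splitting the pooled values into two groups. The sketch below illustrates that idea in plain Python. The readings are invented for illustration and are not data from this study, and the software used for the reported tests is not stated here, so the reported P values may come from a different method (for example, a normal approximation).

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of pairs with x_i > y_j, ties counted as 0.5."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def exact_two_sided_p(x, y):
    """Exact two-sided P value by enumerating all relabellings of the pooled data."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    centre = n * m / 2  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - centre)
    hits = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count splits at least as extreme as the observed one
        if abs(mann_whitney_u(gx, gy) - centre) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical luminescence readings (arbitrary units), six wells per condition
light = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0]
dark = [95.0, 102.0, 99.0, 110.0, 97.0, 105.0]

u = mann_whitney_u(light, dark)
p = exact_two_sided_p(light, dark)
print(u, p)
```

Enumeration is only practical for small groups, since the number of splits grows combinatorially with sample size.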


doi:10.1038/nbt.3245 
EditForce


The challenges faced by EditForce Inc., 
to go beyond genome editing 


EditForce Inc. provides an alternative
option to genome editing and a novel
tool for versatile editing of RNA molecules
at the genomic scale, called ‘transcriptome
editing’, based on pentatricopeptide repeat
(PPR) protein engineering technologies. Our
core technologies were invented at Kyushu 
University, Japan. The company was founded 
in May 2015 and is located in Fukuoka city 
in Kyushu, an island in south-west Japan. 
EditForce is an innovative company in the 
post-genomic era. Our mission is to provide 
novel DNA/RNA operating tools to understand 
and modify various living entities, for example 
plants and animals, and to translate our PPR 
technologies in various biological industries, 
including the pharmaceutical and agricultural 
industries. 


New tools lead to a new world 
Recently established genome editing 
technologies will open new avenues for 
biological research and development. EditForce 
Inc. was created to develop and apply DNA/ 
RNA manipulation technologies using our 
core technology, PPR protein engineering, 
which includes PPR-based novel genome 
editing and versatile editing of RNA known 
as “transcriptome editing”. Our technology 
enables sequence-specific manipulation
of a single DNA molecule and a single RNA
molecule in living cells. RNA manipulation
could provide another layer of manipulation 
in various living entities, such as plants and 
animals. The core technology is based on PPR 
protein engineering, which was developed 
by Professor Nakamura’s group at Kyushu 
University. Professor Nakamura is a founder of 
EditForce Inc. and has studied a plant-specific 
large family of PPR proteins. PPR proteins
have a modular architecture of repeated
35-amino-acid PPR motifs, similar to the
repeat architecture of TALE proteins.
Nakamura’s group recently elucidated the
principle behind the DNA/RNA binding
mechanism and the recognition code¹,
resulting in the development of new
applications for engineered DNA and RNA
binding proteins using DNA-binding and
RNA-binding PPR proteins, respectively.
EditForce Inc. was established in May 2015 
through funding by KISCO Ltd (http://www. 
kisco-net.jp), which is a trading and business 
development company that is involved in 
advanced technologies related to chemicals, 
electronics, plastics and biotechnology, and 
most recently started a wide range of support 
for Spiber Inc. (http://www.spiber.jp). EditForce 
Inc.’s headquarters and research facility are
located in Fukuoka city on Kyushu, the
country’s third-largest island, in south-west
Japan (Figure 1). The goal of EditForce Inc. is to
improve PPR-based technologies, to provide 
the research community and bioindustries 
with access to these technologies, in tight 
collaboration with Kyushu University, and to 
build a seamless pipeline from basic sciences 
to applied sciences. In May 2015, EditForce Inc.
started offering innovative molecular tools for
DNA and RNA manipulations to understand
and modify living entities and to translate
PPR technologies for bioindustries including
medicine, agriculture and bioproduction.



Figure 1 | EditForce Inc. is located in Fukuoka 
city, Japan. EditForce Inc. produces new 
genome/transcriptome editing tools based on 
pentatricopeptide repeat (PPR) proteins, which 
belong to a plant-specific RNA/DNA binding 
protein family. 

Figure 2 | PPR proteins. (a) Schematic representation of PPR-RNA interaction. A single PPR motif
recognizes a single specific base that is defined by the combination of amino acids at three positions
(1, 4, and ii, highlighted in red). (b) Tertiary structure of PPR array (Protein Data Bank; 4M59).
(c) Examples of PPR codes for RNA binding.


Demands for new DNA/RNA 
manipulation tools in the post- 
genomic era 

Modern molecular biology proposes a central 
dogma that explains the flow of genetic 
information from DNA to RNA to protein. The 
universality of the central dogma indicates 
that all truth and enigma of biological systems 
are hidden in the genome sequence. Strong 
interest in revealing genetic information
has driven whole genome sequencing and
advances in DNA sequencing technologies.
The development of high-throughput
sequencing methods, called next generation
sequencing (NGS), has made it possible to
acquire large amounts of DNA sequence
at once. For example, the human genome,
which is composed of three billion base pairs, 
can be sequenced within one day using this 
technique. NGS provided a huge amount of 
genome sequence information for numerous 
living organisms, such as animals, fish, plants 
and insects. NGS has also been used to identify 
minor differences within genome sequences 
among individuals, which may be involved
in susceptibility to disease and pathogens or
sensitivity to chemicals. In our post-genomic
era, suitable molecular tools are in demand
to understand and utilise this vast amount of
information. Recently established genome
editing technologies are DNA manipulation
tools that enable the modification of a single
genetic locus within the genome in a living cell.


Despite the central dogma principle,
important variations from DNA occur in RNA,
especially in eukaryotic cells. These arise from
a series of processing steps, including
RNA splicing, control of localisation, and
control of translation efficiency, that determine
how much and when a protein is produced. Thus,
RNA provides diverse and crucial functions
allowing eukaryotic cells to read their genomes.
Moreover, the animal genome is largely
occupied by ‘junk DNA’. For
example, the proportion of protein coding 
genes represents only 1.5% of the human 
genome. However, simultaneous analysis of 
the transcriptome (gene expression studies) 
revealed that a large part (more than 60%) of 
the animal genome is transcribed into RNA. A 
huge amount and a variety of non-coding RNAs 
have been identified, which are transcribed, 
but not translated into proteins. The functions 
of most non-coding RNAs still remain unclear. 
Protein coding genes have been found to
be more complex than originally thought.
Through alternative splicing, various products
can be produced from a single genomic
locus. One challenge in biology during this
century is to understand how DNA/RNA 
information is utilised by living cells and how 
we can effectively use this information. Our 
PPR-based RNA manipulation technique has 
promising potential to be a novel versatile RNA 
manipulation tool in response to this demand. 


PPR proteins, a priority of 
EditForce’s technology 

PPR protein is the abbreviation for
pentatricopeptide repeat (PPR) motif-
containing protein. ‘Penta-trico-peptide’
denotes 35 amino acids. A PPR protein
typically consists of an array of repeating
35-amino-acid PPR motifs (Figure 2). The PPR
motif and PPR protein were first recognised
in 2000, by informatics studies during the
genome sequencing project of a model
plant, Arabidopsis thaliana. The PPR protein
genes were identified as a gene family that
is only found in land plants (500 variations
of PPR genes/plant)². A series of analyses
demonstrated that PPR proteins are involved in
various gene expression events in a sequence-
specific manner. Typically, a single PPR protein
targets a single RNA or DNA molecule and
is required for RNA stability, RNA splicing,
processing, RNA editing, or transcription of
chloroplast or mitochondrial genes³. In plants,
most PPR proteins (90%) seem to target
RNA molecules, therefore functioning as
RNA-binding PPR or canonical PPR proteins, but
the remaining PPR proteins act as DNA-binding
PPR proteins.

Our interest was to determine how PPR 
proteins act as sequence-specific RNA-binding 
proteins. PPR proteins contain tandem arrays
of 2-30 PPR repeats, each of which folds into a
pair of anti-parallel α-helices (Figure 2). The PPR
motif array forms a super-helical binding surface
similar to that observed in the transcription
activator-like effector (TALE) protein. The
principle of PPR-RNA recognition has been
elucidated by focusing on the relationship
between the motifs and binding sequences.
One PPR motif corresponds to one nucleotide
and the combination of amino acids at three
particular positions (residues 1, 4, and ii) defines
the nucleotide specificity in a programmable
manner¹ (Figure 2b and 2c). This knowledge
provides the rationale for the engineering 
of custom RNA binding proteins, which 
correspond to arbitrary RNA sequences. 

Furthermore, we found that the PPR-RNA 
recognition principle could be applied to 
DNA-binding PPR proteins, by collaborating 
with Prof. Yamamoto's group at Hiroshima 
University. Thus, we have established 
engineering platforms for both custom DNA 
and RNA binding proteins. EditForce Inc.’s first
challenge was to bring these new PPR-based 
tools outside of the plant organelles. 


DNA-binding PPR as a new 
genome-editing tool 

Genome modification techniques have been 
used by humans for millennia and have been 
beneficial. A good example is plant breeding in 
agriculture over the past 6,000 years. Rational 
DNA manipulation began with the discovery 
of restriction enzymes in the 1960s. Recently 
established genome editing technologies 
using zinc finger, TALE, or CRISPR have enabled 
rational modification of specific gene(s) in 
living cells⁴,⁵ (Box 1). Our DNA-binding PPR-
based technology is protein-based and can be
used in the same way as TALE proteins, owing
to the modular PPR structure with a definite
DNA-recognition code that is independent of
motif context (Figure 3). PPR proteins present
several advantages. The amino acid sequence of
natural PPR proteins is less conserved, which
permits PCR-based amplification or chemical
DNA synthesis, and PPR proteins place no
constraint on the choice of target DNA sequence, in
contrast to the requirement for a 5’-T residue
for TALE or the PAM sequence for CRISPR/Cas.
Improving the activity of DNA-binding PPR
proteins and developing a practical construction
method are the two major tasks to be tackled.


RNA-binding PPR as a novel 
versatile RNA manipulation tool for 
transcriptome editing 
In contrast to the progress in DNA manipulation 
tools for genome editing, equivalent RNA 
manipulation tools that can operate on a
single specific RNA in a living cell are still in 
their infancy. Available RNA manipulation 
tools can be divided into two classes, RNA- 
based or protein-based tools. Currently, the 
most prominent RNA manipulation tools are 
the guide RNA-based siRNAs used for RNA 
interference, which enables sequence-specific 
transcriptional or post-transcriptional gene 
silencing and is available in most eukaryotes 
by introducing double-stranded RNAs. 
This silencing technology has been widely 
applied in research. However, the therapeutic 
use of nucleic acid-based technologies, 
including RNAi, has been hampered due to 
the instability of RNAs in vivo, absence of 
suitable delivery systems, unwanted innate 
immune responses, and off-target side effects⁶.
Moreover, these technologies are limited to 
‘knockdown’ applications, and cannot be 
used to switch on the expression of an mRNA 
or to alter the function of an encoded gene 
product. Consequently, the development 
of alternative, protein-based technologies 
would be highly desirable. Several efforts 
to alter gene expression by targeting RNA 
with modified RNA binding proteins (RBPs) 
have been attempted by various scientists. 
However, the proteins were unable to operate 
on endogenous mRNAs because of the 
unavailability of RBPs with arbitrary specificity. 
Pumilio/fem-3 (PUF) protein appeared 
suitable for the construction of designer RBPs⁷.
PUF proteins typically consist of eight tandem
repeats of a 36 amino acid unit. The structure
of PUF–RNA complexes presents a simple ‘one
repeat to one nucleotide’ correspondence, with
RNA contact defined by two amino acid residues
at specific positions. Several investigators

Figure 3 | Characteristics of available DNA/RNA manipulation tools. The custom DNA-binding
PPR is similar to transcription activator-like effector (TALE). RNA-binding PPR is a protein-based,
unique technology, enabling the engineering of arbitrary sequences of any length. The number in
parentheses indicates the target sequence length.

Figure 4 | Possible DNA/RNA manipulation tools based on PPR proteins. Various DNA/RNA
manipulation tools can be designed by fusion of sequence-specific DNA/RNA binding modules
and effector domains with various functions. EditForce Inc. provides a wide variety of DNA and
RNA manipulation tools using PPR proteins. Zinc finger (ZF), TALE, or dCas (inactivated form of
clustered regularly interspaced short palindromic repeat-associated (Cas)) are also applicable for DNA
manipulation tool engineering.


successfully engineered PUF proteins to recognise
new target sequences. Custom PUF repeat
proteins have been conjugated with various
effector domains for RNA cleavage, RNA
visualisation, control of translation and alternative
splicing. Therefore, these proteins are appealing
as useful protein-based tools⁷. However, the
flexibility of the binding sequence alteration
was unexpectedly restricted due to the context
dependency of PUF, as observed for the
DNA-binding zinc finger module. Additionally, a
PUF protein recognises a sequence of only eight
to nine nucleotides, which significantly limits
its versatility.

The RNA-binding PPR motif has emerged as
being exceptionally promising for the rational
design of sequence-specific RBPs. PPR proteins
bind nucleic acid with a 1:1 ratio between
protein motifs and RNA bases, with a definite
RNA recognition code determined by a few
key residues in each motif¹. Natural PPR proteins
consist of 2-30 repetitions of PPR motifs,
suggesting that they might be applicable to
target sequences of various lengths⁸ (Figure 3).
Moreover, their huge natural diversity in plants 
provides plenty of resources for engineering. 
The PPR-based versatile RNA manipulation 
tool could provide a novel transcriptome 
editing technique via the design of new RNA 
manipulation tools. 


Potential of DNA/RNA 
manipulation tools 

Recently designed genome-editing tools using 
zinc finger, TALE and CRISPR, which enable
the rational deletion or insertion of a DNA
sequence in vivo, open a new avenue in biology.
Although genome editing mainly uses the 

Box 1 | The history of gene modification techniques

Plant breeding is an ancient gene modification technique that is performed by crossing two 
strains with different genetic backgrounds to produce new crops and vegetables with desirable 
characteristics such as high and stable yield, good taste or beautiful petal colours. The breeding 
processes still remain time- and labour-consuming as these processes largely rely on the 
occurrence of random DNA alterations. 

Restriction enzymes are protein-based DNA scissors, discovered in the 1960s. However, their 
short recognition sequence (typically a 6 base pair recognition site) has restricted their application 
at the genomic scale. The question remains: What DNA sequence length is required to select a 
single specific locus from the human genome, which is three billion base pairs long? This can be 
theoretically estimated to be 16 base pairs, because DNA presents four variations of A, T, C, and 
G (1/4¹⁶, approximately one site in four billion base pairs). In an attempt to solve this problem,
several approaches were developed using a meganuclease that cleaves a 14-44 base pair-specific 
DNA sequence. However, alteration of the sequence specificity is severely restricted. 
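The 16-base-pair estimate above can be checked with a few lines of arithmetic. The sketch below assumes a random sequence in which the four bases are equally frequent and ignores the reverse strand. The function names are illustrative only.

```python
def min_unique_length(genome_bp, alphabet=4):
    """Smallest site length n for which alphabet**n is at least the genome size."""
    n = 0
    while alphabet ** n < genome_bp:
        n += 1
    return n


def expected_random_sites(genome_bp, site_len, alphabet=4):
    """Expected number of chance occurrences of one specific site of a given length."""
    return genome_bp / alphabet ** site_len


HUMAN_GENOME_BP = 3_000_000_000  # three billion base pairs, as stated in the text

print(min_unique_length(HUMAN_GENOME_BP))         # 16
print(expected_random_sites(HUMAN_GENOME_BP, 6))  # 732421.875
print(expected_random_sites(HUMAN_GENOME_BP, 16))
```

Under these assumptions a 6 base pair restriction site is expected to occur several hundred thousand times by chance, whereas a 16 base pair site is expected less than once.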

The first DNA scissors with arbitrary sequence recognition technology was the zinc finger-
based technology⁴. A single zinc finger module recognises three nucleotides, and the repeating
architecture of the zinc finger protein can be used to combine zinc finger modules to obtain
specific sequence recognition of sufficient length; for example, six repeats correspond
to an 18-nucleotide DNA element, which is long enough to specify a single locus within the genome.
However, the zinc finger module inherently prefers the ‘GNN’ sequence and its specificity is altered
by the adjoining zinc finger module (motif-context dependency). Therefore, it was difficult to
design and construct a zinc finger protein with the intended DNA-binding capacity (Figure 3).

These limitations have been largely solved by TALE-based technologies⁴. TALE presents a
modular structure of 34 amino acid unit repeats. A single TALE unit corresponds to a single
nucleotide, and the amino acids at two positions determine the DNA base recognised in a
programmable and motif-context-independent manner. Since TALE enabled the rational design
of the molecular scissors, genome-editing technology was applied to various organisms,
including animals, plants, fish and bacteria. Limitations of the TALE technology include the need
for a specialised cloning method, Golden Gate assembly, due to the high amino acid conservation,
and the requirement of a thymine (T) residue at the 5’-end of the target DNA sequence, due to the
presence of the un-exchangeable N-terminus region.

CRISPR/Cas (clustered regularly interspaced short palindromic repeat/CRISPR associated) then
drastically improved genome-editing techniques⁵. This technique uses a guide-RNA based system
that enables fast and cost-effective genome editing by just designing an oligonucleotide of 20
nucleotides that is complementary to the target DNA sequence. This system is also applicable for
multiple gene modifications at once. Its simplicity and power outweigh its limitations: the
requirement for a PAM sequence (-NGG) at the 3’-end of the target sequence, and concerns
regarding a high off-targeting rate.


fusion of the DNA-binding module to a DNA
scissor (nuclease), the potential of DNA-binding
modules is huge, because these DNA tools
can be easily modified to design different DNA
manipulation tools by exchanging the nuclease
domain with other functional domains such as
a transcription repressor/activator, epigenetic
regulator, or fluorescent protein. As with
DNA manipulation tools, the PPR-based RNA
binding module can be connected to various
effector domains including a ribonuclease,
translational activation or repression domain,
localisation domain, or fluorescent protein to
visualise endogenous RNA (Figure 4). Thus, PPR-
based RNA manipulation tools provide novel
opportunities to control endogenous RNAs.
These DNA/RNA manipulation tools should
considerably facilitate genome structure
and function studies and related industrial
applications. Moreover, combining the use of
multiple DNA/RNA manipulation tools would
help to create more elaborate and effective
controls of the biological process of interest.


EditForce vision of PPR engineering 
EditForce focuses on research and 
development of molecular tools operating 
both on DNA and RNA, based on PPR 
technology (Figure 5). These tools have the potential
to drive innovation in various biological research
fields and industries.

For basic sciences: The human genome 
sequence was determined in 2003. The number 
of protein-coding genes is approximately
25,000, occupying 1.5% of the total genome
sequence. The number is unexpectedly low
and almost equivalent to that in mouse and
plants, indicating that the number of protein-
coding genes does not explain the complexity
and identity of living organisms. A parallel
comprehensive study of the transcriptome
revealed the unexpectedly complex, novel
features of a ‘new RNA continent’, including
a huge number of non-coding RNAs and
extensive alternative RNA splicing, especially 
in mammals. Therefore, RNA is an important 
key factor to understand the complexity and 
identity of human beings. Editing tools for 
DNA and RNA can be applied to investigate 
the genome complexity and the new RNA 
continent. 

Furthermore, an emerging research 
area in synthetic biology has received 
special attention. Synthetic biology is an 
interdisciplinary area developed to design 
and construct biological devices or biological 
systems. Together with the recent advances 
in the chemical synthesis of long DNA 
sequences, the field of ‘genome design’, 
involving more drastic modifications of the 
genome sequence or the chemical build-up 
of the whole genome sequence, could be a 
realistic technology. Genome editing tools, 
including zinc finger, TALE, CRISPR/Cas, and 
DNA-binding-PPR, facilitate the design of 
genome and transcription circuits that initiate 
gene expression. Furthermore, our PPR-based 
RNA manipulation tools would add operation 
systems at the RNA level. Combining DNA 
and RNA operation tools will lead to a new 
technology for ‘genome programming’. Various
DNA/RNA manipulation tools will open the 
door of genome design/programming beyond 
genome editing. 

For medicine: Genome editing has been 
anticipated in the medical field, especially 
for reproduction and gene therapy. Several 
attempts have already started using TALE and 
CRISPR, including chimeric antigen receptor 
(CAR) T cell immunotherapy and the generation 
of disease-relevant cells using pluripotent 
stem cells. We are also focusing on these 
relevant areas. Moreover, our protein-based 
RNA manipulation tools using RNA-binding 
PPR can be applied to acquired diseases, which,
unlike genetic diseases, affect many people.
The discovery of RNAi provided a strong
driving force for establishing RNA-
targeted pharmacology. However, siRNA-based
pharmacology has been hampered by the
difficulties described earlier (off-target effects,
absence of appropriate delivery systems,
and instability due to the chemical nature
of RNA). The PPR-based technology is protein-
based and could therefore overcome many 
disadvantages of RNA-based tools. EditForce 
Inc. has established a translation enhancement 
tool for specific RNA, which may be applied 
to reinforce the activity of the tumour 
suppressor gene, p53, for cancer therapeutic 
purposes, for example. The development of
various RNA manipulation tools for up- and
down-regulation, splicing, and editing (RNA
modification) is in progress. These tools can
be utilised for novel therapeutic use, not merely 
for the suppression of gene expression as for 
siRNA, but also for the up-regulation or change 
in the coding capacity without affecting the 
patient’s genome. Moreover, we succeeded in 
designing a custom RNA-binding PPR molecule 
that specifically recognises a human infectious 
RNA virus. This RNA-binding PPR molecule will 
be used for detection and therapy in the near 
future. 

For agriculture: genome editing can also 
be used in agriculture. In fact, genome editing 
is listed as one of the new plant breeding 
techniques, enabling fast and precise genetic 
modifications without introducing foreign 
genes in the final product⁹. Therefore, genome-
edited crops could fall outside the current legal
regulation based on the Cartagena Protocol
on Biosafety. Although regulation of genome-
edited plants is currently under debate,
PPR-based genome editing is also applicable 
to agricultural breeding. PPR research and 
development for agricultural purposes has 
been in part supported by the Cross-ministerial 
Strategic Innovation Promotion Program (SIP) 
of the Council for Science, Technology and 
Innovation (CSTI) in Japan. Furthermore, we
have developed an F1 hybrid seed product
using our RNA manipulation technology.
F1 hybrid seed production aims to confer
hybrid vigour, also called heterosis, on various
cultivars. In F1 hybrid seed production,
cytoplasmic male sterility (CMS) is used as an
important agricultural trait¹⁰. The CMS trait
is caused by the expression of an aberrant 
protein-coding gene in the mitochondrial 
genome, but the presence of a nuclear gene of 
restorer-of-fertility (Rf) cancels the sterility by 
post-transcriptional repression of the aberrant 
mitochondrial gene. The CMS-Rf system has 
been used in various cultivars, including those 
of maize, canola, and rice, for a long time.
Various Rf genes have been identified and, 
in most cases, assigned as RNA-binding PPR 
protein genes. Elucidation of the PPR-RNA 
recognition mechanism led to the design of 

Figure 5 | EditForce Inc.’s vision is to use PPR-based DNA and RNA manipulation technologies to edit
the genome and transcriptome in living cells. We provide genome and transcriptome editing tools for
various fields of biological sciences, including basic science, medicine, agriculture, and bio-production.


artificial Rf genes that enable more efficient
use of the CMS-Rf system or the application
of the CMS-Rf system to crops in which this
system has not yet been applied. Mitochondrial
genome editing is also one of our research
and development interests for the purpose of
designing artificial CMS genes and developing
pairs of artificial Rf and CMS genes for a wide
range of crops.

For bio-production: Various materials,
special proteins and chemical compounds
for a wide range of applications, such as
biofuels and therapeutic uses, have been
produced in living organisms, including
bacteria, cultured cells and plants. Two kinds
of system are utilised: homologous systems,
in which the compound of interest is
produced by the organism itself, and
heterologous expression systems, in which
foreign protein(s) are expressed using an
appropriate host organism as a bioreactor.
Genome editing technologies enable
the application of these technologies to various 
organisms and should facilitate the engineering 
of host organisms to improve bio-production. 
EditForce Inc.’s technologies, which edit both
DNA and RNA, enable the optimisation of 
the genome (DNA) and the expression circuit 
(DNA and RNA). The PPR-based translational 
activation tools, described above, have been 
used for the production of a useful protein 
of therapeutic interest. The development of 
another molecular tool operating in subcellular 
localisation, and nuclear-cytoplasmic transport, 
is ongoing. These RNA operation tools would 
open other opportunities for cell engineering 
in bio-production. 


In conclusion, the development of genome 
editing tools should allow us to better 
understand and effectively use the huge 
amount of genetic information currently 
available. EditForce provides another
option for genome editing tools and a
new technology for transcriptome editing.
Combining the use of DNA and RNA 
manipulation tools will open a new field of 
genome design/programming. EditForce Inc. 
provides genome design/programming tools, 
which will lead to a better understanding of 
biology and can be applied in a wide range 
of industries. 


REFERENCES 

1. Yagi, Y. et al. Elucidation of the RNA
recognition code for pentatricopeptide 
repeat proteins involved in organelle RNA 
editing in plants. PLoS One 8, e57286 (2013). 

2. Small, I.D. & Peeters, N. The PPR motif — a TPR-
related motif prevalent in plant organellar
proteins. Trends Biochem Sci. 25, 46-47 (2000). 

3. Nakamura, T. et al. Mechanistic insight 
into pentatricopeptide repeat proteins as 
sequence-specific RNA-binding proteins for 
organellar RNAs in plants. Plant Cell Physiol. 
53, 1171-1179 (2012). 

4. Perez-Pinera, P. et al. Advances in targeted 
genome editing. Curr Op Chem Biol. 16, 
268-277 (2012). 

5. Doudna, J.A. & Charpentier, E. Genome
editing. The new frontier of genome 
engineering with CRISPR-Cas9. Science 346, 
1258096 (2014). 

6. Davidson, B.L. & McCray, P.B. Jr. Current 
prospects for RNA interference-based 
therapies. Nature Rev Genet. 12, 329-340 
(2011). 

7. Wang, Y. et al. Engineered proteins with 
Pumilio/fem-3 mRNA binding factor scaffold 
to manipulate RNA metabolism. FEBS J. 280, 
3755-3767 (2013). 

8. Yagi, Y. et al. The potential for manipulating
RNA with pentatricopeptide repeat proteins. 
Plant J. 78, 772-82 (2014). 

9. Lusser, M. & Cerezo, EM. Comparative 
regulatory approaches for new plant 
breeding techniques, JRC Scientific and 
Technical Reports. (http://ftp.jrc.es/EURdoc/ 
JRC68986.pdf) (2012). 

10. Chase, C.D. Cytoplasmic male sterility: a 
window to the world of plant mitochondrial- 
nuclear interactions. Trends Genet. 23, 81-90 
(2007). 


SPONSOR RETAINS SOLE RESPONSIBILITY FOR CONTENT 




3 December 2015 
nature.com/diagnostics-modelling 


Infectious disease control and elimination 
Modelling the impact of improved diagnostics 



INTRODUCTION


Expanding the role of diagnostic and
prognostic tools for infectious diseases in
resource-poor settings


Azra C. Ghani, Deborah Hay Burgess, Alison Reynolds & Christine Rousseau


Nature 528, S50-S52 (3 December 2015), DOI: 10.1038/nature16038


This article has not been written or reviewed by Nature editors. Nature accepts no responsibility for the accuracy of the information provided. 


The life-saving impact of new diagnostic and prognostic technologies
that aim to reduce the burden of infectious diseases is often not well
understood. Although the potential benefits of other interventions
such as drugs and vaccines can be estimated by simply counting the numbers 
treated and multiplying that by the effect size of the intervention, understand- 
ing the role that diagnostics can have requires more complex analyses. As 
for other interventions, the performance of the tool is important. Few, if any, 
diagnostic tools have 100% sensitivity and specificity or a perfect quantita- 
tive range. However, unlike drugs or vaccines, the impact of the diagnostic de- 
pends on the actions taken after the diagnostic or prognostic test result. First, 
the tests may not always be run or interpreted correctly because they are of- 
ten used by staff with minimal training. This may further reduce performance, 
depending on the level of the health-care setting in which the test is used. This 
has been clearly demonstrated by a South African study¹ in which several HIV
rapid diagnostic test procedures were observed and only 3.4% were found to 
have been performed in full compliance with procedure, suggesting that there 
is a potential for high rates of misdiagnoses. Second, the clinician or
health-care worker must interpret the results and make the appropriate clinical
decision. In the case of Cepheid GeneXpert and malaria rapid diagnostic tests, for
example, studies have shown that even when the test gives the correct result,
treatment is often provided empirically²,³. Third, the clinical decision needs to
be realized. This will depend on the availability of appropriate treatment facili- 
ties and drug stocks. Crucially, the combination and timing of these processes 
can affect the onward transmission of infectious diseases at the population 
level and hence have an impact on the control of epidemics or progress to- 
wards elimination of endemic diseases. 
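The dependence of a diagnostic's impact on the steps that follow the test can be made concrete with a small calculation. The Python sketch below multiplies the probabilities along the pathway described above: the test detects the infection, it is run correctly, the result is acted on, and treatment is available. The function name and all numerical values are hypothetical illustrations, not estimates from the studies cited, and the steps are assumed to be independent.

```python
def probability_correctly_treated(sensitivity, p_run_correctly,
                                  p_result_acted_on, p_treatment_available):
    """Chance that an infected patient is treated because of the test.

    Simplifying assumption: the steps are independent, and a failure at
    any step means the test result does not lead to treatment.
    """
    steps = (sensitivity, p_run_correctly, p_result_acted_on,
             p_treatment_available)
    if any(not 0.0 <= p <= 1.0 for p in steps):
        raise ValueError("all inputs must be probabilities between 0 and 1")
    product = 1.0
    for p in steps:
        product *= p
    return product


# Hypothetical values for illustration only.
ideal_setting = probability_correctly_treated(0.95, 1.0, 1.0, 1.0)
field_setting = probability_correctly_treated(0.95, 0.8, 0.7, 0.9)

print(f"ideal setting: {ideal_setting:.3f}")
print(f"field setting: {field_setting:.3f}")
```

Under these assumed values, a test with 95% sensitivity leads to correct treatment in fewer than half of infected patients once the downstream steps are included, which is the gap that simple "numbers treated times effect size" calculations miss.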

This complexity adds to the controversy in assessing the value of diag- 
nostics and often delays the already long process of discovery, development 
and delivery of new technologies for global infectious diseases. This was ad- 
dressed in the 2006 Nature Publishing Group supplement Improved Diagnos- 
tic Technologies for the Developing World, which used modelling techniques to 
define the value of new diagnostic tools for resource-poor settings⁴. Over the
subsequent 10 years there has been encouraging progress in the development 
and use of new diagnostics, but many gaps remain. By way of a response, this 
collection presents new modelling work that addresses the potential impact of 
diagnostic tools both at the individual and population level. 

This work could not have come at a more crucial time. Over the past 
decade there has been a shift in the epidemiology of infectious diseases, 
with dramatic reductions in burden, which was catalysed by the Millennium
Development Goals and the associated increase in global health funding⁵.
This has been accompanied by a shift from control of diseases in centralized 
health-care settings to prevention and early treatment. Accompanying this 
changing epidemiology, diagnostics are increasingly demanded and used in 
novel health-care paradigms. Technological advances have supported this 
shift. For example, the reach of centralized laboratory testing can be extend- 
ed through the use of specimen collection and stabilization technologies in 
combination with sample transport systems, such as the use of dried blood
spots for HIV levels, malaria parasite detection and serology⁶⁻⁸. In addition,
new portable and integrated technologies can allow testing at primary health 
facilities, providing greater access to care and adoption⁹. This was first
quantified in a seminal study from Mozambique where point-of-care CD4
technologies with same-day results enabled a near doubling of patients on treatment,
owing to the reduction in loss to follow-up that normally occurs when patients 
wait several weeks for results from centralized laboratory testing¹⁰. For chronic
conditions such as HIV, self-testing is also being considered as a method to 
improve testing coverage and ultimately linkage to care¹¹. Rapid results and
ease of use are clearly key characteristics that are affected by the treatment 
paradigm or patient flow. For example, the balance may be shifted towards 
sensitivity over specificity if the reason for testing is to determine onward re- 
ferral rather than immediate treatment. For reasons such as these, the target 
product specification for a diagnostic in these settings is likely to differ to that 
in centralized health-care settings. 

As reported in the 2006 supplement, modelling can be a useful tool to 
capture the health impact that improved diagnostics can have on global health 
efforts. However, although the decision-analytic approach previously adopted 
was appropriate to estimate the impact of a diagnostic at the point of care 
on health outcomes such as cases, deaths and disability-adjusted life years 
(DALYs), in the wider contexts it is also important to capture the effect of 
the diagnostic on onward transmission. Transmission dynamic models are 
well developed for such a purpose and are increasingly used to guide product 
development and public-health decision-making for a wide range of diseases. 
However, the integration of diagnostics into such models is often overlooked. 
To fill this gap, the Diagnostics Modelling Consortium funded by the Bill and 
Melinda Gates Foundation was formed in 2013 to catalyse the incorporation 
of diagnostics into transmission dynamic models across key global health dis- 
eases, including HIV, tuberculosis, pneumonia, malaria and neglected tropical 
diseases. The Consortium and its partners brought together not only model- 
ling groups, but also those involved in diagnostics development and disease 

Table 1 | Levels of laboratory testing available for public health programmes in different levels of the health system. Adapted from ref. 12.

Health-care level | Description | Appropriate diagnostic or prognostic tools
0 | Informal: ‘under the tree’ | First point of care with a community health worker; tool must be simple to use and not require special storage. Prognostic tools particularly relevant for rapid referral.
1 | Primary: health post and centres | Simple diagnostic techniques, including collection of dried blood spots and rapid or dipstick tests.
2 | District: district hospital | Act as referral centre for specimens sent from level 1. Include dedicated laboratory space, trained technicians and reagents. Can manage a more extensive test menu for diagnosis and treatment.
3 | Regional or provincial: referral hospitals or part of regional or provincial health bureau | Laboratory facilities sufficient to perform complete menu testing for HIV/AIDS, tuberculosis and malaria as well as many other diseases. Typically include level 2 laboratories.
4 | National or multicountry: reference laboratories for one or more countries |


specialists who could define the strategic needs within these priority disease 
areas. Over an 18-month period the groups worked together to define the 
questions that, when answered, would best inform diagnostic product and 
prognostic tool development, and to extend existing models to address these 
questions. At the same time, the group sought to share experiences and les- 
sons learned across the disease areas. 

Two themes emerged in the subsequent work. The first was the impor- 
tance of considering the patient flow for use of both diagnostic and prognostic 
tools in the wider community. The papers by Floyd et al. and Arinaminpathy 
and Dowdy describe the importance of capturing individuals in the wider 
community who do not promptly seek care. Floyd et al. assess the potential 
impact of a new prognostic device — pulse oximetry — for pneumonia, and 
find that simple medical devices that increase early prognosis of severe pneu- 
monia could have a substantial public health impact as well as being highly 
cost-effective, provided subsequent access to oxygen treatment is available. 
Arinaminpathy and Dowdy also consider this issue, but more broadly, for tu- 
berculosis, arguing that the evaluation of new diagnostics needs to take into 
account not only the sensitivity and specificity of the diagnostic itself, but also 
the impact that it can have on patient behaviour and care seeking. 

Similarly, the Working Group on Modelling of Antiretroviral Therapy Mon- 
itoring Strategies in sub-Saharan Africa explore the potential public health im- 
pact and cost-effectiveness of using viral load measurement to differentiate 
levels of care so that those with a lesser need visit the clinic less often, thereby 
freeing health-care capacity for those in greater need. Despite the limitations 
associated with viral load testing using dried blood spots (and noting that 
point-of-care tests may become available in the future), they find that such an 
approach is cost-effective. In the second HIV article, Sharma et al. present a 
systematic review of the methods that have been used to improve coverage of 
HIV testing. They find that compared with facility-based testing, community 
testing and counselling is a model that identifies HIV-infected individuals at an 
earlier stage of infection (higher CD4 counts). In addition, they find that mo- 
bile and self-testing are even more effective in reaching key population groups, 
including men, young people and those at higher risk. 

Efforts have been made to standardize the settings where diagnostic 
technologies can be used, given the range of levels that exist, from centralized
laboratories to minimally-resourced settings. During a meeting in January
2008 held in Maputo, Mozambique, the World Health Organization brought
together key stakeholders who were charged with making recommendations
on laboratory standardization and harmonization¹². This group defined four
tiers of the laboratory system (see Table 1), as well as level 0 or ‘under the
tree’, an informal site where diagnostics can and should be used. As patients
flow through these levels, a range of different technologies will probably be
used to meet their needs, and it will therefore be important to optimize the
tools’ placement and use based on potential impact. Now that technologies
are available that can integrate into each laboratory setting, next-generation 
modelling efforts will need to address optimal placement. 

Table 1 (continued) | Level 4, appropriate diagnostic or prognostic tools: strengthen laboratory capacity for all diseases of concern and provide molecular and esoteric testing that cannot be performed in level 2 laboratories, for example nucleic acid assays, HIV drug resistance studies and tuberculosis drug susceptibility studies.

The second theme was the use of new diagnostic technologies to target
interventions with the purpose of disease elimination. This comes at a time
when new global commitments for the elimination of diseases have been
made, including a call in 2007 by Bill and Melinda Gates to move towards
malaria eradication¹³ (and subsequent inclusion of elimination goals in the World
Health Organization Global Technical Strategy¹⁴) alongside the elimination
goals set out for neglected tropical diseases under the London Declaration¹⁵.
Diagnostic needs for elimination pose new challenges. First, as the disease 
declines to low levels, identifying remaining foci of infection is important. For 
this, diagnostic tools need to be sufficiently sensitive to detect the remain- 
ing reservoir of infection. However, as the end point is reached, and in the 
subsequent maintenance phase, identifying infected individuals becomes cru- 
cial. For this, highly sensitive and rapid diagnostic tests are required. In both 
the elimination and the maintenance phase, diagnostics have a crucial role 
in the overall surveillance strategy, providing the first indication of potential 
for re-emergence. However, for many diseases appropriate diagnostics for 
this phase are not yet developed, with major challenges remaining in relating 
the different biomarkers to disease status. For example, although substantial 
progress has been made in moving towards the elimination of onchocercia- 
sis in West Africa, current diagnostic tools — such as skin-snips to detect 
micro-filariae and nodule palpation to identify foci of transmission — are un- 
likely to identify very low levels of infection and thus may have insufficient 
sensitivity to prevent resurgence¹⁶. Hence, for diseases such as this it is likely
that a combination of diagnostic tests will need to be used, each targeted to 
the appropriate stage of transmission. 

These issues are addressed in two related articles on Plasmodium falci- 
parum malaria. Wu et al. undertake an extensive review of the relationship 
between current diagnostic tools used in endemic settings. They find that 
current rapid diagnostic tests detect only 41% of infections detected by 
high-sensitivity polymerase chain reaction (PCR) techniques, indicating that a 
substantial number of infections will be missed if this diagnostic is used in the 
field and implying that a large infectious reservoir remains. Slater et al. contin- 
ue this theme to estimate the target product specifications for new diagnostic 
tests that aim to reduce onward transmission. They find that increasing the 
sensitivity of the current rapid diagnostic test tenfold could detect 83% of the 
infectious reservoir. Applying this strategy to settings in sub-Saharan Africa 
and Asia, the authors demonstrate that increase in sensitivity could widen the 
areas in which mass screen-and-treat programmes and targeted mass drug 
administration could succeed in interrupting transmission. 

Medley et al. take a similar approach to assessing the role of diagnostics 
for visceral leishmaniasis — concentrating on the potential for elimination in 
the Indian subcontinent. They find that shortening the time from health seek- 
ing to diagnosis could dramatically reduce incidence, especially if a diagnostic 
can be developed that is able to detect infected individuals before the onset of 
clinical kala-azar. The study also highlights the importance of bringing model- 
lers and scientists together to develop diagnostic tools early on in the devel- 
opment process. Given the overall poor understanding of the aetiology and 
transmission biology of pathogens such as Leishmania, modelling can help to 
identify key parameters for which further data are needed. In turn, the data 
collected can be used to refine the models in subsequent iterations, potential- 
ly speeding up the process of diagnostic development. 

Finally, while the work of the Consortium was underway, the world expe- 
rienced the unprecedented spread of Ebola virus disease across West Africa. 
During the subsequent global health response it became apparent that the 
reliance on PCR-based diagnostics resulted in significant delays in diagnosis. 
Nouvellet et al. review the development of rapid diagnostic tests for Ebola vi- 
rus disease over the past year and use modelling to explore the potential ben- 
efits of such tests. Their results demonstrate the key role of rapid diagnostics 
to guide individual treatment decisions, while also reducing the potential scale
of future epidemics. 

Although the outputs of the Diagnostics Modelling Consortium presented 
in this supplement clearly demonstrate the impact and cost-effectiveness of 
new diagnostic approaches for multiple diseases of global health significance, 
the impact of diagnostics remains overlooked. The results can be grave, rang- 
ing from overestimation of the impact of interventions when perfect diagnosis 
is assumed to ignoring the potential role of a diagnostic tool to facilitate low- 
er-cost approaches to treatment. This supplement pulls together a range of 
articles that highlight the importance of considering the individual- and pop- 
ulation-level aspects of the use of diagnostics, encouraging a shift in mindset 
for all infectious-disease modelling moving forward. In addition, the interdis- 
ciplinary nature of this work should not be underestimated. Bringing together 
the key scientific, clinical and strategic perspectives is imperative from the 
start of any effort to develop and use technology. Modelling, even at its best, 
is only a way to describe and quantify our thoughts. To truly determine the 
impact of diagnostics technologies, they must be evaluated in the field. Only 
then can we place the appropriate diagnostic and prognostic tools in the right 
settings to achieve our global health goals. 

The editorial process for this supplement was coordinated by Azra Ghani 
and Alison Reynolds of the Diagnostics Modelling Consortium. We thank Deir- 
dre Hollingsworth, Tim Hallett and Nilufar Hampton for their assistance. We 
also thank the many anonymous peer reviewers for their support in this process. 

Evaluating the impact of pulse oximetry on childhood pneumonia mortality in resource-poor settings

It is estimated that pneumonia is responsible for 15% of childhood deaths worldwide. Recent research has shown that hypoxia
and malnutrition are strong predictors of mortality in children hospitalized for pneumonia. It is estimated that 15% of children 
under 5 who are hospitalized for pneumonia have hypoxaemia and that around 1.5 million children with severe pneumonia re- 
quire oxygen treatment each year. We developed a deterministic compartmental model that links the care pathway to disease 
progression to assess the impact of introducing pulse oximetry as a prognostic tool to distinguish severe from non-severe pneu- 
monia in under-5 year olds across 15 countries with the highest burden worldwide. We estimate that, assuming access to sup- 
plemental oxygen, pulse oximetry has the potential to avert up to 148,000 deaths if implemented across the 15 countries. By 
contrast, integrated management of childhood illness alone has a relatively small impact on mortality owing to its low sensitivity. 
Pulse oximetry can significantly increase the incidence of correctly treated severe cases as well as reduce the incidence of incor- 
rect treatment with antibiotics. We also found that the combination of pulse oximetry with integrated management of childhood 
illness is highly cost-effective, with median estimates ranging from US$2.97 to $52.92 per disability-adjusted life year averted 
in the 15 countries analysed. This combination of substantial burden reduction and favourable cost-effectiveness makes pulse
oximetry a promising candidate for improving the prognosis for children with pneumonia in resource-poor settings.

Despite interventions being available, it is estimated that pneumonia
is responsible for 15% of childhood deaths worldwide¹. Reductions in
annual mortality remain modest, with nearly 950,000 under-5 year
olds dying of pneumonia in 2013 (ref. 2). Despite the unprecedented rate of
Haemophilus influenzae type B (Hib) and pneumococcal vaccine (PCV) in- 
troduction, achieving high levels of coverage in developing countries is still 
challenging³. Therefore, in regions where vaccine introduction and scale-up
lags behind other countries, improved access to diagnosis and treatment is 
crucial. This includes interventions at multiple points in the continuum of care 
— improving care-seeking practices, increasing the availability of suitable di- 
agnostics, and guiding both formal and informal care providers in appropriate 
disease management. Unfortunately, current treatment coverage remains low, 
and, more importantly, most childhood pneumonia deaths result from a lack 
of, or delay in, accurate diagnosis⁴.

A crucial component of improving pneumonia outcomes is the early iden- 
tification of patients at risk of treatment failure and the timely provision of 
supportive care. However, in the absence of appropriate prognostic tools at 
the frontline, currently recommended World Health Organization (WHO) 
guidelines for integrated management of childhood illness (IMCI) often lead 
to an overuse of antibiotics and the under-referral of patients with severe 
pneumonia who require hospital care⁵. The most recent 2015 technical
update of IMCI guidelines defines non-severe pneumonia as the presence of fast
breathing or chest in-drawing or both, which is treatable with oral antibiotics.
Severe pneumonia is defined as cough or difficulty breathing in the presence of
danger signs, and requires referral to a hospital or health facility for injectable
antibiotics or other supportive care such as oxygen therapy⁶. Currently,
identification of these IMCI symptoms remains inconsistent and unreliable among
community health-care workers or carers without clinical training⁷. Therefore,
improved prognostic and diagnostic tools for case-management are neces- 
sary to substantially reduce pneumonia-associated morbidity and mortality. 
Hypoxaemia and malnutrition are strong predictors of mortality in children 
who are hospitalized for pneumonia⁸,⁹. This has led to increasing support for
the use of oxygen therapy and monitoring oxygen saturation in the manage- 
ment of severe cases. It is estimated that 15% of children who are hospitalized 
for pneumonia have hypoxaemia (oxygen saturation, or SpO2, of <90%; ref. 10)
and that around 1.5 million children with severe pneumonia require oxygen
treatment each year¹¹. The use of pulse-oximetry devices (used to measure the
oxygen level in the blood) in community health-care settings has been proposed 
as a method to identify hypoxic children at risk of treatment failure. These de- 
vices may be particularly beneficial at the frontline given that they require little 
training and reduce the reliance on clinical symptoms. The current pulse-oxi- 
metry systems are also quick, non-invasive and require minimal infrastructure. 
The aim of this study was to evaluate the public health impact and 
cost-effectiveness of current IMCI guidelines combined with pulse-oximetry 
devices as a prognostic tool in the hands of frontline health workers in
resource-poor settings. To do this, we developed a model of disease progression
that explicitly tracks the continuum of care pathways, and parameterized the
model for the 15 countries with the highest burden of pneumonia.

Figure 1 | Overview of model structure. The main states and transitions of the model are shown, with cases transitioning at time-dependent rates. Each state contains a subset of states to track the length of infection.


METHODS 
Model structure. The progression and treatment of pneumonia in a population of 
children under the age of five was modelled using a continuous-time determin- 
istic compartmental model (Fig. 1). Full mathematical details of the model are 
given in the Supplementary Information. Without treatment, children may be in 
one of four states: susceptible (S), non-severe (NSV), severe (SV) or death (D). 
These definitions of non-severe and severe disease are different to those defined 
by IMCI. We classified severe disease as disease that requires hospitalization 
and non-severe disease as disease that can be successfully managed with oral 
antibiotics alone. Susceptible children become infected at a constant rate deter- 
mined by the incidence of pneumonia in the population, and although most enter 
the non-severe state before progressing, a small proportion progress directly to 
severe disease. Children with non-severe disease may recover naturally (without 
treatment), progress to a treatment state or progress to the severe state. Chil- 
dren with severe disease may recover naturally, progress to a treatment state, or 
die and be tracked in the death state. The non-severe and severe
disease states are each further subdivided by day of infection. Children with
non-severe pneumonia who have not progressed to severe disease or recovered
naturally after 14 days return to the susceptible state, whereas children with severe
pneumonia who have not received treatment or recovered naturally after 14 days 
(giving a maximum potential illness length of 28 days) are assumed to have died. 
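The untreated part of this structure can be sketched in a few lines of Python. The sketch is not the authors' implementation (which is continuous-time and given in their Supplementary Information): it assumes a daily time step, a closed population scaled to 1, no treatment states, and a hypothetical daily incidence. The mean durations and the 5% proportion severe on day 1 are the central estimates listed in the article's Table 1 of model parameters; converting them to daily probabilities with competing exponential rates is an assumption made here.

```python
import math


def simulate(days, incidence_per_day=0.001, p_severe_day1=0.05,
             nsv_recover_days=3.0, nsv_progress_days=10.0,
             sv_recover_days=4.0, sv_death_days=7.0, max_days=14):
    """Daily-step natural history: susceptible, non-severe, severe, death."""
    s, d = 1.0, 0.0
    nsv = [0.0] * max_days  # non-severe cases indexed by day of infection
    sv = [0.0] * max_days   # severe cases indexed by day of infection

    def daily_split(rate_a, rate_b):
        # Two competing exits within one day (assumed exponential rates).
        total = rate_a + rate_b
        p_leave = 1.0 - math.exp(-total)
        return p_leave * rate_a / total, p_leave * rate_b / total

    p_nsv_rec, p_nsv_prog = daily_split(1 / nsv_recover_days,
                                        1 / nsv_progress_days)
    p_sv_rec, p_sv_die = daily_split(1 / sv_recover_days, 1 / sv_death_days)

    for _ in range(days):
        new_cases = incidence_per_day * s
        next_nsv = [0.0] * max_days
        next_sv = [0.0] * max_days
        back_to_s = deaths = progressed = 0.0
        for day in range(max_days):
            rec, prog = nsv[day] * p_nsv_rec, nsv[day] * p_nsv_prog
            stay = nsv[day] - rec - prog
            back_to_s += rec
            progressed += prog
            if day + 1 < max_days:
                next_nsv[day + 1] = stay
            else:
                back_to_s += stay   # unresolved non-severe cases return to S
            rec, die = sv[day] * p_sv_rec, sv[day] * p_sv_die
            stay = sv[day] - rec - die
            back_to_s += rec
            deaths += die
            if day + 1 < max_days:
                next_sv[day + 1] = stay
            else:
                deaths += stay      # unresolved severe cases are assumed to die
        next_nsv[0] = new_cases * (1 - p_severe_day1)
        next_sv[0] = new_cases * p_severe_day1 + progressed
        s += back_to_s - new_cases
        d += deaths
        nsv, sv = next_nsv, next_sv
    return {"S": s, "NSV": sum(nsv), "SV": sum(sv), "D": d}


one_year = simulate(365)
print(one_year)
```

Because cases that progress enter day 1 of the severe state, a child can spend up to 14 days in each state, which reproduces the 28-day maximum illness length described above.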
As well as making a natural recovery, non-severe and severe cases may 
separately enter one of two treatment states: treatment received outside of 
hospital (referred to as community-based treatment) or hospital-based treat- 
ment, at rates that depend on the care-seeking rate and factors in the care 
pathway. The latter is determined by a decision tree in which four factors 
are included: availability of a prognostic tool, whether or not the prognostic 
tool gives the correct result, adherence to the prognosis (whether or not the 
correct treatment, according to the prognosis, was administered by the
medical practitioner and followed by the patient) and treatment availability. An
example of a section of the decision tree is given in Supplementary Figure 1.
We aimed to capture correct and incorrect treatment rates of both non-severe
and severe cases, so cases may move into either treatment state depending on
the outcome of the decision tree, regardless of whether or not it is the correct 
treatment. Moreover, this design includes the possibility that cases can move 
into the correct treatment state even if the prognosis was incorrect. 
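One branch of such a decision tree can be written out as a product of the four factors. The Python sketch below follows only the branch in which a care-seeking severe case is correctly classified as severe, the prognosis is adhered to, and hospital care is accessible; the other branches of the tree (shown in the Supplementary Information) are not reproduced, and treating the factors as independent is an assumption. The numerical inputs are the central estimates from the article's Table 1 of model parameters; the function and variable names are illustrative.

```python
def p_severe_case_hospitalised(p_tool_available, sensitivity,
                               adherence_severe, p_hospital_access):
    """Probability that a care-seeking severe case reaches hospital treatment
    via a correct 'severe' prognosis (a single branch of the decision tree)."""
    return (p_tool_available * sensitivity
            * adherence_severe * p_hospital_access)


# Central estimates from Table 1: sensitivity and adherence to a severe
# prognosis for IMCI alone and for the two pulse-oximetry scenarios.
scenarios = {
    "IMCI": {"sensitivity": 0.55, "adherence_severe": 0.65},
    "PO1": {"sensitivity": 0.70, "adherence_severe": 0.85},
    "PO2": {"sensitivity": 0.85, "adherence_severe": 0.85},
}

results = {
    name: p_severe_case_hospitalised(1.0, p_hospital_access=0.61, **values)
    for name, values in scenarios.items()
}

for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

In this simplified branch the probability roughly doubles between IMCI alone and the higher-sensitivity pulse-oximetry scenario, with both the sensitivity and the assumed adherence contributing to the difference.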

Last, if treatment fails to work, then the case may move into one of two 
treatment failure states, determined by a probability of treatment success that 
depends on both the severity of the disease and whether the treatment is ap- 
propriate. For example, severe cases are less likely to be cured by communi- 
ty-based treatment than hospital-based treatment. Children with non-severe 
disease who fail to respond to treatment may progress to hospital treatment,
to a severe state or naturally recover to the susceptible state. Children with 
severe disease who fail to respond to treatment may progress to hospital 
treatment, to the death state or naturally recover to the susceptible state at 
different rates to the non-severe cases. 

Community-based treatment is assumed to consist of a course of amox- 
icillin that lasts for 3 days, whereas hospital-based treatment lasts for 7 days. 
To ensure that treatment does not prolong illness in the model, those who 
enter the treatment states at a late stage of infection (such as day 13 or 14 
of non-severe illness) are assumed to recover before reaching the end of the 
treatment, instead of progressing through all 3 or more days of treatment and 
therefore taking longer to return to susceptible than if they are left untreated 
(see Supplementary Fig. 2). 


Model parameters. The public health impact of the introduction of pulse-oximetry
devices was evaluated at the community level in comparison to a baseline
standard of care with IMCI by calculating the incremental deaths averted with
the introduction of pulse oximetry in different countries. Countries included in
the analysis were selected according to the number of pneumonia-attributed
deaths that occur each year in that country, using estimates from 2011 (ref. 12).
Although more recent estimates of pneumonia mortality in under-fives are
available, these did not include estimates of the number of cases and so the
2011 estimates were used for the analysis. The 15 countries with the highest
number of deaths were chosen, excluding China, Angola and Tanzania owing
to a lack of data on the availability of community-based care in these countries.

Table 1 | Model parameters. The central estimates shown are derived from the source literature with ranges added for the sensitivity analysis.

Parameter | Value (range) | Sources
Incidence | Country-specific ± 10% | Ref. 12
Proportion severe on day 1 | 5% (2-10%) | Ref. 13
Mean duration of non-severe illness before recovery | 3 days (2-4 days) | Ref. 25
Mean duration of non-severe illness before progression to severe illness | 10 days (9-11 days) | Estimated from model (see Methods)
Mean duration of severe illness before recovery | 4 days (3-5 days) | Ref. 22
Mean duration of severe illness before death | 7 days (6-8 days) | Ref. 26
Proportion bacterial versus viral (NSV) | 85% viral (75-90%); 15% bacterial (25-10%) | Ref. 27
Proportion bacterial versus viral (SV) | 85% bacterial (75-90%); 15% viral (25-10%) | Assumed
Mean duration of illness before care seeking | NSV 3 (2-4) days; SV 0.75 (0.5-1) days | Ref. 26
Probability that community-based treatment is available | Country-specific ± 10% | Ref. 3
Probability of timely hospital access | 0.61 ± 10% | Ref. 24
Probability of community-based treatment curing non-severe bacterial case | 0.925 (0.90-0.95) | Ref. 28
Probability of treatment with hospital care curing case | 0.925 (0.80-0.95) | Assumed to be high if oxygen is available with lower values representing poorer standard of care
Probability of treatment with amoxicillin curing severe case if prescription adhered to | 0.65 (0.6-0.7) | Ref. 29 (based on treatment failure rates of patients with hypoxia at baseline)
Probability of prognostic available | 1 (0.9-1) | Assumed to be high for the purpose of this analysis
Sensitivity of IMCI | 0.55 (0.5-0.6) | Ref. 30
Sensitivity of PO1 | 0.7 (0.65-0.75) | Estimated
Sensitivity of PO2 | 0.85 (0.8-0.9) | Ref. 14
Specificity of IMCI | 0.85 (0.8-0.9) | Assumed to be high given low overall referral rates
Specificity of PO1 and PO2 | 0.85 (0.8-0.9) | Assumed to be similar to IMCI
Adherence to non-severe prognosis (IMCI) | 0.55 (0.5-0.6) | Refs 31-33
Adherence to severe prognosis (IMCI) | 0.65 (0.6-0.7) | Refs 31,32
Adherence to non-severe prognosis (PO1 and PO2) | 0.55 (0.5-0.6) | Assumed to be similar to IMCI
Adherence to severe prognosis (PO1 and PO2) | 0.85 (0.8-0.9) | Assumed to be high for the purpose of this analysis
Prognosed SV treated with community-based treatment versus nothing | 1 | Assumed that prognosed SV will always be treated even if not referred to hospital
Prognosed NSV that is hospitalized versus receiving nothing | 0.025 (0.01-0.05) | Assumed that prognosed NSV are unlikely to be incorrectly hospitalized

IMCI, integrated management of childhood illness; NSV, non-severe pneumonia; PO1, IMCI and pulse oximetry combination with a sensitivity of 70%; PO2, IMCI and pulse oximetry combination with a sensitivity of 85%; SV, severe pneumonia.

Table 2 | Cost parameters. The central estimates shown are derived from the source literature with ranges added for the sensitivity analysis.

Parameter | Value (range) | Sources
Amoxicillin treatment per child | US$0.1614 drug cost plus country-specific delivery cost (± 10%) | Ref. 34
Average hospital cost per episode | Country-specific (± 10%) | Ref. 15
Pulse oximeter | $250 | Ref. 35
Batteries | $2 ($1.5-2.5) | Assumed
Uses per set of batteries | 840 | Ref. 36
Lifetime of device | 2 years | Ref. 36
Number of devices needed | 1 per 1,000 children under 5 (0.8-1.2) | Assumed
Delivery and distribution costs | 20% | Assumed


Figure 2 | Pneumonia incidence and mortality. a, Estimated under-5 pneumonia incidence predicted by the model and previously reported mortality¹² in the 15 countries with the highest burden. b, Median estimates of pneumonia deaths per 1,000 children under 5 across the 15 countries with the highest mortality for four different community-level prognostics: none, integrated management of childhood illness (IMCI), combined IMCI and pulse oximetry with 70% sensitivity (PO1) and combined IMCI and pulse oximetry with 85% sensitivity (PO2). DRC, Democratic Republic of the Congo. [Plot not reproduced; axes: deaths per 1,000 children per year and disease incidence per 1,000 children per year.]

The incidence of pneumonia in the model was fitted to the mortality data
by finding the incidence rate that best matches the mortality data using a
normal likelihood. The rate of progression from non-severe to severe disease was
fitted to estimates of proportions of cases that progress to severe disease¹²,¹³.
All other disease progression parameters were based on a review of the
literature and expert opinion (Table 1). The availability of community-based
care was fitted simultaneously to data on the percentage of children with
suspected pneumonia receiving antibiotics³, assuming a binomial likelihood,
whereas the other care-seeking parameters were obtained from the literature 
where available (Table 1). One care-seeking parameter — the probability of 
treatment with hospital care curing the case — could not be identified in the 
literature. For the purpose of assessing the impact of the pulse oximeter as 
a prognostic tool, we assumed that this was high, representing a situation in 
which oxygen and other facilities are available. Lower values of this parameter 
will reduce the impact and cost-effectiveness of any prognostic tool; this was 
explored in our sensitivity analysis. 

To estimate the impact and cost-effectiveness of a new prognostic com- 
bination (IMCI and pulse oximetry), we also needed to make a number of 
assumptions about its availability, its ability to accurately classify a case as 
severe (sensitivity) or non-severe (specificity) compared with IMCI alone, and 
adherence to its use. For the purposes of assessing its utility, we assumed it 
would be made available and hence set this parameter to a high value. Al- 
though data were available to support the sensitivity of IMCI, sufficient data 
were not available to inform the sensitivity of IMCI when combined with pulse 
oximetry. Thus, we proposed two scenarios: one in which the addition of pulse 
oximetry increases the sensitivity of IMCI to 70% (referred to as PO1), and 
one in which the sensitivity of the combination is increased to 85% (referred 
to as PO2), reflecting the potential of pulse oximetry to identify both hypoxic
cases and cases with abnormal oxygen saturation (90-95%) that would benefit
from referral. We could not find data on specificity and assumed that it would
be relatively high (85%) for both IMCI alone and the
PO1 and PO2 prognostic packages (Table 1). Adherence to IMCI guidelines 
for both non-severe and severe cases was based on the literature, and we set 
severe prognosis adherence to the PO1 and PO2 prognostic packages to be 
higher (increased from 65% to 85%), reflecting the perception that a phys- 
ical tool would increase the likelihood of adherence. Finally, two parameters 
are included to account for non-adherence to the prognosis. For those whose 
prognosis is severe disease, if the mother did not take the child to hospital
despite referral we assumed that as a minimum the child would receive amoxicillin
(taking into account its availability). For those with a non-severe prognosis,
but for whom the treatment regimen was not adhered to, we assumed that 
they had a high probability of receiving no treatment. 
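
As a rough illustration of how sensitivity and adherence combine, the sketch below multiplies the two to give the share of severe cases that both receive a severe prognosis and act on it. It is a simplification and not the model used in the analysis: it ignores tool availability, hospital access and the fallback to amoxicillin, and the function name is hypothetical. The 55% IMCI sensitivity is the central value quoted in the Discussion; the other values are the scenario settings described above.

```python
def severe_cases_acted_on(sensitivity: float, adherence: float) -> float:
    """Share of severe cases that get a severe prognosis AND adhere to referral.

    Simplifying assumptions (not from the source model): the prognostic tool is
    always available and the two steps are independent.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= adherence <= 1.0):
        raise ValueError("sensitivity and adherence must lie in [0, 1]")
    return sensitivity * adherence


# Scenario settings taken from the text: (sensitivity, severe-prognosis adherence).
SCENARIOS = {
    "IMCI": (0.55, 0.65),
    "PO1": (0.70, 0.85),
    "PO2": (0.85, 0.85),
}

if __name__ == "__main__":
    for name, (sens, adh) in SCENARIOS.items():
        print(f"{name}: {severe_cases_acted_on(sens, adh):.3f}")
```

Because the two factors multiply, raising either one alone already increases the share of severe cases that are referred and acted on, which is the behaviour the sensitivity analysis later examines.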


Costing approach. The incremental cost-effectiveness of pulse oximetry was 
evaluated in comparison to a baseline of using IMCI alone. Costing was under- 
taken from a public health provider perspective and hence no societal, econom- 
ic or private sector costs were included. 

Costs were subdivided into two categories. Additional direct costs included 
the cost of prognostic equipment, batteries and delivery and distribution of the 
prognostic tool. Further health-care costs included additional amoxicillin cours- 
es for patients with non-severe disease, increased hospitalization costs for pa- 
tients with severe disease and the cost savings that would arise from the reduc- 
tion in inappropriate amoxicillin prescriptions or treatment (owing to assumed 
increased prognostic adherence) or hospital referral. Total costs are the direct 
costs plus the additional health-care costs. Wherever possible, costing data 
were obtained from the literature (Table 2). Health-care costs were obtained at 
the country level from the WHO-CHOICE database. This included the average
cost of an outpatient visit and the cost of inpatient stays based on an average of 
7 days of hospitalization. The cost implications arising from potential overuse 
of amoxicillin (antibiotic resistance) were not included in the analysis given the 
difficulty of obtaining any meaningful quantitative estimate of this.
Disability-adjusted life years (DALYs) were calculated using country-specific
life-expectancy values and were not discounted or age-weighted, in line with
recent recommendations. The overall cost-effectiveness is then presented as the
cost per DALY averted compared with IMCI.
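
The cost per DALY averted is an incremental ratio: the extra cost of the pulse-oximetry scenario over IMCI alone, divided by the extra DALYs it averts. A minimal sketch follows; the function name and the input figures are illustrative assumptions, not values from the analysis.

```python
def cost_per_daly_averted(cost_new: float, cost_baseline: float,
                          dalys_new: float, dalys_baseline: float) -> float:
    """Incremental cost-effectiveness ratio of a new strategy versus a baseline.

    Costs are total programme costs; DALYs are the disease burden remaining
    under each strategy, so DALYs averted = dalys_baseline - dalys_new.
    """
    dalys_averted = dalys_baseline - dalys_new
    if dalys_averted <= 0:
        raise ValueError("new strategy averts no DALYs; ratio is undefined")
    return (cost_new - cost_baseline) / dalys_averted


if __name__ == "__main__":
    # Hypothetical inputs per 1,000 children per year (illustration only).
    ratio = cost_per_daly_averted(cost_new=1_500.0, cost_baseline=1_000.0,
                                  dalys_new=400.0, dalys_baseline=500.0)
    print(f"US${ratio:.2f} per DALY averted")
```

A lower ratio means a more favourable intervention, so any cost left out of the numerator makes the result look better than it is.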


Sensitivity analysis. To explore sensitivity to key parameters, Latin hypercube
sampling was used to draw 1,000 parameter sets, with each parameter sampled from
a triangular distribution. Outputs from the model were calculated using each set
of parameters and the top and bottom 5% of the results were discarded to obtain
90% ranges. Additional sensitivity analyses are reported in the Supplementary 
Information. 
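
The sampling scheme described above can be sketched in a few lines. This is an assumed implementation using only the Python standard library, not the authors' code: the function names are hypothetical, each parameter is given a (low, central, high) triple, and the "model output" is a stand-in because the disease model itself is not reproduced here.

```python
import math
import random


def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution on [low, high] peaking at mode."""
    cut = (mode - low) / (high - low)  # CDF value at the mode
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))


def latin_hypercube(n_samples, n_params, rng):
    """Return n_samples rows; each column has one draw per equal-width stratum."""
    columns = []
    for _ in range(n_params):
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across parameters
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def sample_parameters(ranges, n_samples=1000, seed=0):
    """ranges maps name -> (low, central, high); returns a list of parameter dicts."""
    rng = random.Random(seed)
    names = list(ranges)
    rows = latin_hypercube(n_samples, len(names), rng)
    return [{name: triangular_ppf(u, *ranges[name]) for name, u in zip(names, row)}
            for row in rows]


def central_range(values, trim=0.05):
    """Discard the top and bottom `trim` share of values; return (min, max) of the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return kept[0], kept[-1]


if __name__ == "__main__":
    # Battery cost range is taken from Table 2; the sensitivity range is the PO1
    # range quoted in the Results. The "output" is a stand-in, not the real model.
    draws = sample_parameters({"battery_cost": (1.5, 2.0, 2.5),
                               "po1_sensitivity": (0.65, 0.70, 0.75)})
    outputs = [d["battery_cost"] / d["po1_sensitivity"] for d in draws]
    low, high = central_range(outputs)
    print(f"90% range of the stand-in output: {low:.2f} to {high:.2f}")
```

Stratifying each parameter before shuffling ensures that every twentieth of each parameter's range is sampled equally often, which plain random sampling does not guarantee at 1,000 draws.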


RESULTS 


Figure 2a shows previous annual pneumonia mortality estimates and our estimates
of the incidence of pneumonia per 1,000 children in the 15 countries with the
highest burden. As expected, the estimated pneumonia incidence follows a similar
pattern to mortality, but with differences owing to between-country variation in
the availability of community-based care. Niger, Ethiopia and India are estimated
to have particularly low rates of antibiotic treatment, and thus are predicted by
the model to have the highest case-fatality rates.

The predicted mortality across the 15 countries under the different prognostic
scenarios is shown in Figure 2b. In comparison to a baseline scenario of no
prognostic tool (modelled as a prognostic with a sensitivity and
specificity of 50%), IMCI is predicted to have a small incremental impact on 
mortality. This is largely driven by the fact that the sensitivity of IMCI in identi- 
fying severe cases is only 50-60%, and hence not much greater than ‘chance’. 
By contrast, we predict that distribution of pulse oximetry to affected commu- 
nities could result in much greater reductions in mortality. This is driven by both 
the higher sensitivity of pulse oximetry (65-75% for PO1, 80-90% for PO2) 
compared with IMCI (50-60%) and our assumption that there would be a 
higher adherence rate to PO1 and PO2 (80-90% for severe cases) compared 
with IMCI (60-70% for severe cases). The countries with the poorest rates of 
community-based treatment have the greatest reduction in mortality when only 
IMCI is considered (compared with no prognostic tool); whereas for PO1 and 
PO2, the countries with the highest incidence of disease are predicted to have 
the greatest reduction in mortality compared with no prognostic tool used. 

The reduction in mortality under IMCI, PO1 and PO2 compared with
the absence of a prognostic tool can be translated directly into estimates of
deaths averted per 1,000 children at risk. Our estimates of the number of
deaths averted by PO1 and PO2 are substantially higher than those estimated
to be averted by IMCI, with the highest impact per 1,000 children in Somalia,
Mali and Niger (Fig. 3a). Although the effect of IMCI on deaths averted
is small relative to pulse oximetry, there is still a benefit compared with the
absence of a prognostic tool. The estimated impact of pulse oximetry is
more apparent when translated to absolute values scaled to country-specific
under-5 population sizes (Fig. 3b). In absolute terms, the introduction of
pulse-oximetry devices is estimated to result in the greatest annual reductions
in pneumonia deaths in India (75,500 deaths averted per year for PO2)
and in Nigeria (15,400 deaths averted per year for PO2) owing to their large
under-5 populations (128 million and 27 million, respectively). Collectively,
we estimate that the implementation of PO1 (the more conservative estimate
of IMCI and pulse-oximetry sensitivity) could avert 103,000 (90% range =
77,000-135,000) deaths annually across the 15 countries with the highest
burden. For PO2, this increases to 148,000 (90% range = 112,000-193,000).

Figure 3 | Deaths averted by prognostic tools across 15 countries. a, Estimated deaths averted per 1,000 children per year under integrated management of
childhood illness (IMCI), PO1 (combined IMCI and pulse oximetry with 70% sensitivity) and PO2 (combined IMCI and pulse oximetry with 85% sensitivity), each
compared with a baseline of no prognostic. b, Estimated absolute number of deaths averted by PO1 and PO2 compared with IMCI alone. Error bars show 90% range
from the sensitivity analysis. DRC, Democratic Republic of the Congo.

A key aim of improved prognostic tools is to increase the number of pa- 
tients with severe disease receiving correct hospital referral. Using Nigeria as a 
country-level example, we estimated that the proportion of people with severe 
cases receiving hospital referral could increase by 44% by implementing PO1 
or 62% by implementing PO2. We also estimated a substantial reduction in 
incorrect treatment, with the number of people with severe disease receiving
community-based care alone (under-treatment) decreasing by 19% (PO1) and 25%
(PO2). However, a small increase in the number of people with non-severe cases
who receive hospital referral (over-treatment) was also predicted (from 3.99% of
cases to 5.08% for both PO1 and PO2); this is due to the
assumption that higher adherence is associated with a severe prognosis made 
by PO1 or PO2 compared with IMCI alone. 

Increasing the sensitivity of the prognostic tool, which is assumed to result
from adding pulse oximeters to existing IMCI, only has a substantial effect if
other aspects of the health system are functioning at reasonable levels. Using
Nigeria as a country-level example, we identified four key variables that
determined the additional impact of a pulse oximeter: the availability of
amoxicillin, hospital care, oxygen within the hospital and the prognostic tool.

The reduction in mortality is slightly more sensitive to the availability of 
community-based care than the availability of hospital-based care (Fig. 4a). 
This is due to two assumptions: if a person is referred to hospital, but is unable 
to access hospital care, then community-based care may instead be accessed; 
and community-based care includes the provision of a full course of amoxicil- 
lin. Therefore, if hospital-based care is unavailable, cases will probably receive 
amoxicillin instead, and a proportion of these will be cured by amoxicillin. 
Providing a more sensitive prognostic tool only had a substantial impact on
mortality when the availability of hospital care was greater than 20%, when
the oxygen availability exceeded 60% and when the prognostic tool was
available in more than 60% of communities (Fig. 4a). We further investigated
the impact of reduced oxygen availability on both deaths averted and
cost-effectiveness (Fig. 4b) and found that oxygen availability (parameterized as
the hospital cure rate) needs to be at least 60% for deaths to be averted by PO2
compared with IMCI alone. For cost-effectiveness to be below US$40 per
DALY averted, oxygen availability should exceed 70% (Fig. 4b).

PO1 and PO2 are assumed to have both higher sensitivity and higher adherence
to severe prognosis than IMCI alone. To assess the relative contributions of
these two parameters, the change in incidence of cases in each of the four
treatment states compared with IMCI was calculated for five scenarios: the
increase in adherence to a severe prognosis only; the increase in sensitivity
for PO1 only; the PO1 combination of increased adherence and sensitivity; the
increase in sensitivity for PO2 only; and the PO2 combination of increased
adherence and sensitivity (Supplementary Fig. 3). We found that increasing
each parameter alone had a substantial effect on the incidence of deaths and
treated cases. Increased adherence to a severe prognosis alone caused more
non-severe cases to be incorrectly treated, because the poor sensitivity of IMCI
caused some non-severe cases to be given a prognosis of severe, and these
are then treated because of the higher adherence. The increase in severe
prognosis adherence resulted in a small increase in incorrect treatment for
non-severe cases (relative to the total number of non-severe cases), but a
substantial increase in correct treatment for severe cases (relative to the total
number of severe cases). The combination of increased adherence and prognostic
sensitivity had the largest impact on the correct treatment of severe cases.

Across all the countries studied, community-based care costs (US$0.56-3.70
per course of amoxicillin) are small in comparison to the corresponding
hospital costs ($6.44-130.34 for a 7-day inpatient stay). The cost of the
intervention itself (approximately $165 per 1,000 children per year) is also
estimated to be lower than the additional health-care costs in most of the
countries. As such, overall increases in health-care costs under pulse oximetry are
largely associated with higher hospital referral rates for severe pneumonia cases.

The cost-effectiveness of implementing pulse oximetry in the 15 countries 
with the highest burden is shown in Figure 5. PO1 and PO2 are most cost-ef- 
fective in Niger ($3.72 and $2.97 per DALY, respectively), the Democratic 
Republic of the Congo ($6.81 and $4.81 per DALY, respectively) and Ethiopia 
($6.57 and $5.00 per DALY, respectively), partly driven by the comparatively 
lower costs of hospital care in those countries. For comparison, an
insecticide-treated mosquito net (ITN) to prevent malaria is estimated to cost
between $5 and $31 per DALY averted, whereas HIV antiretroviral therapy has
been estimated to cost upwards of $150 per DALY averted.

Figure 4 | Sensitivity analyses. a, The relationship between prognostic scenario
parameters, prognostic sensitivity and the estimated annual pneumonia deaths 
per 1,000 children under 5 under varying prognostic scenarios: 1) the availability 
of amoxicillin (community-based care); 2) the availability of hospital care; 3) the 
availability of the prognostic tool; and 4) the availability of oxygen (hospital care). 
All other parameters were fixed at their central values. b, The impact of oxygen 
availability on cost-effectiveness and deaths averted, assuming a combined 
integrated management of childhood illness and pulse oximetry sensitivity of 
85% (PO2). All scenarios are modelled using country-specific parameters for 
Nigeria. DALY, disability-adjusted life year. 


DISCUSSION 

Using a simple model that links care pathways to the progression of pneumonia 
in young children, we predict that a combination of pulse oximetry with current 
IMCI guidelines has the potential to avert up to 148,000 deaths per year in 
the 15 countries with the highest burden of pneumonia across Africa and Asia, 
under the assumption that there is more than 90% prognostic tool and sup- 
plementary oxygen availability. This equates to about one-sixth of all deaths 
owing to community-acquired pneumonia in the developing world. For comparison,
it has been estimated that complete elimination of low birth weight would
prevent 25% of pneumonia deaths in developing countries, with a similar
proportion prevented by eliminating malnutrition. Analysis of the impact of the
pneumococcal vaccine for infants, PCV10, predicted that the vaccine has the
potential to directly avert around 262,000 deaths in under-5s across 72
countries. The relative ease of implementation of a pulse oximetry-based
intervention (even with the assumption of perfect availability) compared with the
elimination of low birth weight or malnutrition makes it an important candidate 
for an intervention against pneumonia in resource-poor settings. 

On top of the large reduction in deaths, we predict that the addition of pulse
oximetry to IMCI has the potential to increase the correct treatment of severe
cases by an estimated 44%. When modelling the effect of PO1 and PO2 compared
with IMCI, we increased two key parameters to simulate the implementation of
pulse oximetry. These were prognostic sensitivity (which was set to be higher for
PO1 and PO2 than for IMCI) and adherence to a severe prognostic result (also
higher for PO1 and PO2 than for IMCI). Both substantially contribute to an
increase in the correct treatment of severe cases and thus the predicted reduction
in pneumonia deaths. Sensitivity analysis showed that an increase in either of
these characteristics alone has the potential to prevent deaths. These substantial
burden reductions are explained by the relatively low sensitivity of IMCI for
detecting severe cases (just 55% compared with a potential 70-85% for pulse
oximetry combined with IMCI), and the very high burden of pneumonia in these 15
countries (910,000 deaths attributed to pneumonia in under-5s in 2010; ref. 12).

Figure 5 | Cost-effectiveness of prognostic tools. Estimated cost-effectiveness
(US$ per disability-adjusted life year (DALY)) of PO1 (combined integrated
management of childhood illness (IMCI) and pulse oximetry with 70% sensitivity)
and PO2 (combined IMCI and pulse oximetry with 85% sensitivity) compared
with IMCI in the 15 countries with the highest burden of pneumonia. The
numbers indicate the median estimate, whereas bars represent the 90% range.
DRC, Democratic Republic of the Congo.

The incremental cost-effectiveness of PO1 and PO2 over IMCI was found 
to be very low in 14 of the 15 countries (less than $30 per DALY averted). For 
reference, the gross domestic product (GDP) per capita across these 14
countries ranged from $400 to $3,000 in 2013. Compared with the
cost-effectiveness of the distribution of PCV10, estimated to be $100 per DALY
averted, this seems to be remarkably favourable. However, the extra costs of
providing oxygen support were not taken into account in the calculations of
cost-effectiveness, owing to a lack of data on the availability of oxygen support
at the country level. Including these costs in the analyses would make the
intervention less cost-effective. Nevertheless, we predict that when coupled with
the additional costs of oxygen support, pulse oximetry will still compare
favourably with the PCV10 vaccine. For example, a study in Papua New Guinea
estimated the cost-effectiveness of improving oxygen support in this area
(including oxygen concentrators and the provision of pulse oximeters) to be $50
per DALY averted.

There were several limitations to our analysis. One of these was the lack 
of country-specific data to inform our parameter for access to hospital care. 
We assumed that 61% of people who were referred to hospital would access 
hospital care, based on data from a retrospective case review study of chil- 
dren with severe pneumonia in Tanzania. More realistically, we know that the 
proportion of children reaching an appropriate health facility may vary signifi- 
cantly between countries and so having one single parameter for all countries 
could result in inaccurate estimates. However, our sensitivity analysis showed 
that the model is less sensitive to hospital access than community-based care
access. Another limitation was a lack of data on the availability of oxygen
support across the 15 countries and how it is distributed throughout the health
system. Linked to this was our assumption that the hospital systems in each
country were substantial enough to support the extra cases that would be
hospitalized. More field data on the hospital systems in each country are required
to inform and expand the model to appropriately address these limitations.
Nevertheless, it is clear that for any new prognostic to have impact there is
also a need to invest in strengthening the existing primary and tertiary
health-care facilities so that appropriate care is provided to those who are
referred.

Understanding the incremental value of novel diagnostic tests for tuberculosis


Nimalan Arinaminpathy¹ & David Dowdy²


Tuberculosis is a major source of global mortality caused by infection, partly because of a tremendous ongoing burden of undi- 
agnosed disease. Improved diagnostic technology may play an increasingly crucial part in global efforts to end tuberculosis, but 
the ability of diagnostic tests to curb tuberculosis transmission is dependent on multiple factors, including the time taken by a 
patient to seek health care, the patient’s symptoms, and the patterns of transmission before diagnosis. Novel diagnostic assays for 
tuberculosis have conventionally been evaluated on the basis of characteristics such as sensitivity and specificity, using assump- 
tions that probably overestimate the impact of diagnostic tests on transmission. We argue for a shift in focus to the evaluation 
of such tests’ incremental value, defining outcomes that reflect each test’s purpose (for example, transmissions averted) and 
comparing systems with the test against those without, in terms of those outcomes. Incremental value can also be measured in 
units of outcome per incremental unit of resource (for example, money or human capacity). Using a novel, simplified model of tu- 
berculosis transmission that addresses some of the limitations of earlier tuberculosis diagnostic models, we demonstrate that the 
incremental value of any novel test depends not just on its accuracy, but also on elements such as patient behaviour, tuberculosis 
natural history and health systems. By integrating these factors into a single unified framework, we advance an approach to the 
evaluation of new diagnostic tests for tuberculosis that considers the incremental value at the population level and demonstrates
how additional data could inform more-effective implementation of tuberculosis diagnostic tests under various conditions.

Every year, nearly three million people develop active tuberculosis (TB), but
are not notified to health authorities. Some of these individuals may
spontaneously resolve their disease, die or be treated in the private sector, but
many remain infectious, fuelling ongoing transmission in the community. Reaching
this ‘missing three million’ remains one of the top priorities for global TB
control. A widely cited reason for the ongoing gap between incidence and cases
notified is the lack of highly sensitive and deployable diagnostic tests for TB.
Sputum smear microscopy, the global cornerstone of TB diagnosis, can miss half of
all people with infectious TB, whereas more sensitive tests cannot routinely be
implemented at the point of treatment. Nevertheless, the link between improved
diagnostic sensitivity and better TB detection remains uncertain. Studies in
different settings have found little or no change in the number of pulmonary TB
diagnoses or deaths when comparing sputum smear microscopy and Xpert MTB/RIF,
a more sensitive molecular test. This result may reflect high levels of empirical
treatment among people who test negative. Against this backdrop, a key question
remains: if novel diagnostic tests are developed and implemented at scale,
what impact can we expect on TB epidemiology within populations?

The impact of TB diagnostics on transmission reflects not only the accuracy
of the test, but also the way in which patients with infectious TB interact
with members of the community and with health systems over time. These
infection pathways have at least three crucial dimensions: the transmission
rate (number of transmission events per unit time), the frequency at which
people contact health systems (often lower in subpopulations with poor access
to care), and the probability of starting effective TB treatment after such
contact. Each of these dimensions varies through the duration of infectiousness
(from onset to effective treatment, spontaneous recovery or death).


Mathematical models can be a useful tool in helping to demonstrate how
these dimensions relate to the impact of diagnostic tests on TB transmission.
Figure 1a depicts the simplest, and most commonly used, conceptualization
of TB diagnosis in mathematical models so far. In this framework,
on becoming infectious, people with TB experience a series of uniform
processes. Specifically, they transmit TB at a constant rate, contact the health
system at a constant rate and undergo a constant probability of successful
diagnosis (leading to appropriate treatment) with each health-system contact.
In this framework, the speed at which someone with TB gets treated, and
the number of people they infect before that treatment, are strongly related
to the sensitivity of the diagnostic algorithm. If, for example, people with
TB contact the health system on average every 6 months with a 50% chance
of being diagnosed at each visit, the mean duration of infectiousness will be
1 year (approximately the prevalence/incidence ratio estimated by the World
Health Organization). If a more sensitive test (for example, replacing sputum
smear microscopy with Xpert MTB/RIF) can increase that probability of
diagnosis from 0.5 to 0.75, the mean duration of disease, and thus the
transmission per active case, could be cut by one-third. As a result, the projected
epidemiological impact of a more sensitive diagnostic test in this framework is
tremendous. This conceptualization of the diagnostic process (constant
transmission, constant health-system contact and constant probability of
successful diagnosis) over time has permeated nearly all projections of expected
epidemiological impact from novel diagnostic tests for pulmonary TB, and it is
almost certainly wrong.

Figure 1b shows an alternative conceptualization of the
TB diagnostic process. In this framework, the transmission rate, frequency of
health-system contact and probability of successful diagnosis can all change
over time. As an illustration, if patients remain infectious for an average of 10
months before seeking care and then begin to contact the health system once
a month, a 50% chance of successful diagnosis per visit would still result
in a mean duration of infectiousness of 1 year, but increasing the probability
of diagnosis from 0.5 to 0.75 would only reduce that duration to 11.3 months.
Worse still, if most transmissions occur in the first 10 months, then even a
perfect diagnostic test at the health facility could not avert those events. Thus,
the dynamic trajectories of transmission, health-care seeking and diagnostic
index of suspicion over the course of TB disease are inextricably linked to the
epidemiological impact of novel diagnostic tests, and overly simple
depictions of those trajectories may systematically overestimate that impact.
Adding complexity to these simple frameworks requires additional data to
inform a more nuanced understanding of the impact of diagnostic tests. Without
such data, and models with sufficient flexibility to incorporate them, it is likely
that projections of the impact of novel diagnostic tests on TB transmission will
continue to be biased, often dramatically so.
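
The arithmetic behind these two illustrations can be checked with a short sketch. It assumes, as the examples above imply, that visits occur at a fixed average interval and that each visit has the same independent chance of leading to diagnosis, so the expected number of visits is one divided by that probability. The function name and its parameters are hypothetical.

```python
def mean_infectious_months(delay_months: float, visit_interval_months: float,
                           p_diagnosis: float) -> float:
    """Mean time from onset of infectiousness to successful diagnosis.

    Assumes a care-seeking delay followed by visits at a fixed average interval,
    each with an independent probability p_diagnosis of success, so the expected
    number of visits needed is 1 / p_diagnosis (geometric distribution).
    """
    if not 0.0 < p_diagnosis <= 1.0:
        raise ValueError("p_diagnosis must be in (0, 1]")
    return delay_months + visit_interval_months / p_diagnosis


if __name__ == "__main__":
    for label, delay, interval in [("constant-rate framework", 0.0, 6.0),
                                   ("delayed care-seeking framework", 10.0, 1.0)]:
        before = mean_infectious_months(delay, interval, 0.5)
        after = mean_infectious_months(delay, interval, 0.75)
        cut = 100.0 * (before - after) / before
        print(f"{label}: {before:.1f} -> {after:.1f} months ({cut:.0f}% shorter)")
```

In the constant-rate case the whole duration scales with the probability of diagnosis, whereas in the delayed case only the final two months do, which is why the same improvement in test sensitivity yields a much smaller gain.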

So far, test accuracy (sensitivity and specificity) — and to a lesser extent, 
feasibility of implementation in peripheral settings — has dominated thinking 
about the ‘value’ of new TB diagnostic tests. However, the impact of any novel 
TB diagnostic test will depend on how the health-care system incorporates it,
as well as on the dynamics of patient interactions with that health-care system
(Fig. 1). Epidemiologically, therefore, a novel diagnostic assay should be
evaluated not by its sensitivity and specificity, but rather by the extent to
which it provides diagnostic information beyond earlier tests and practices: its
incremental value. This concept is similar to the classic concept of the expected
value of diagnostic information (EVDI) promoted by Phelps and Mushlin, who also
highlighted the need to combine the EVDI with estimates of cost or resource
requirements. Subsequent work has expanded on this concept. In this paper,
we use principles of infectious-disease modelling and diagnostic epidemiology
to argue for a change in conceptual approach, from one that has focused
primarily on a test's sensitivity to one that centres on its incremental value.


METHODS 

Quantifying the incremental value of diagnostic tests for TB. In the context
of TB, there are a number of benefits that new diagnostics could provide.
These include, but are not limited to, averting TB transmission, averting TB
morbidity and mortality, saving money, freeing up health-care capacity
for other activities, enabling better treatment of other conditions by ruling out
TB and improving patients’ economic situations or quality of life. We focus
here on the use of novel diagnostic tests as tools to avert TB transmission;
however, the intention of some tests may be to add value in one or more of
these other areas, and each test's utility should be evaluated according to
its intended purpose.

To appropriately estimate the incremental value of a new diagnostic test 
for TB in terms of transmissions averted, one must consider its relationship to 
the diagnostic pathways outlined in Figure 1. Table 1 lists four defining features 
of TB disease and diagnosis (latency, gradual symptom onset, reliance
on sputum and concentration of transmission among ‘superspreaders’).
These features highlight a number of potential diagnostic gaps, or elements 
along the TB diagnostic pathway, which, if filled by a novel diagnostic test, 
could generate substantial incremental value. 


Table 1 | Four potential diagnostic gaps in tuberculosis (TB). 


Feature of TB natural history Description 


Resultant source of 




[Figure 1 diagram. Panel a: individual develops active tuberculosis; transmission rate remains constant over time; encounters health system at constant rate; diagnostic test applied independently at each visit. Panel b: individual develops active tuberculosis; transmission rate changes over time (with bacillary burden, cough frequency and contact patterns); health system encounters increase in frequency as symptoms progress; probability of diagnosis (or empiric treatment) increases with higher index of suspicion; treatment. Stages of the TB diagnostic pathway: infectious but not seeking care; mild, non-specific symptoms, seeking care; prolonged, TB-specific symptoms.]

Figure 1 | Conceptual diagrams of different tuberculosis (TB) diagnostic models. 
a, The ‘standard’ model. So far, most models of TB diagnosis have assumed that, 
on becoming infectious, individuals with active TB transmit their disease at a 
constant rate, seek care at a constant rate and maintain a constant probability 
of diagnosis and treatment with each care-seeking attempt. In reality, the rate 
at which individuals with active TB transmit disease and seek care, as well as the 
probability of successful diagnosis and treatment, change over time with the 
disease course. This process can be more accurately represented by assuming 
three different stages in the TB diagnostic pathway, as represented at the 
bottom of b. This framework accommodates different types of variation that can 
be crucial in the potential impact of a test. For example, patients might transmit 
their disease at an increasing rate over time as bacillary burden increases, seek 
care more frequently as symptoms progress, and be more likely to receive 
ancillary diagnostic tests (or empiric treatment) as symptoms persist and other 
diagnoses become less likely. 


Specifically, in any given setting, TB transmission may occur primarily from 
people who are not sufficiently ill to seek care*?°°; those who are seeking care, 
but have symptoms (for example, a mild cough) not specific to TB*; or those 
with severe or prolonged symptoms, but who test negative for TB and are there- 
fore not treated (Fig. 1b). Alternatively, most transmission may occur from hard- 
to-reach populations in which the rate for seeking care is low*. Each of these 


Potential representation Diagnostic test capable of 


transmission 


within models filling gap 


Latency 


Difficult microbiological 
confirmation 
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Prolonged latent period 


Early non-specific symptoms (for 
example, cough) 


Bacilli often present in low numbers, 
and only in lungs or sputum; no 
specific antibody 


Transmission concentrated among 
those with poor access to care 


ndividuals who are asymptomatic or 
have only very mild symptoms 


ndividuals who are presenting to care, 
but for syndromic management 


ndividuals who test false negative 
for TB 


ndividuals who lack sufficient access 
to seek care rapidly 


Asymptomatic (or mildly symptomatic) 
infectious state (I,) 


Infectious state with symptoms 
sufficient to drive care seeking, but 
with low index of suspicion for TB (I) 


Active, care-seeking but undiagnosed 
state (I,) 


State with lower care-seeking rate (I’) 


Test to identify who will progress to 
active disease, allowing targeted 
preventive therapy (‘progression 
biomarker’) 


Test to rule out TB (or suggest further 
testing for TB) in people with a cough 
(‘cough triage test’) 


Test to supplant current tests 
with imperfect sensitivity (‘smear 
replacement test’) 


Smear replacement test for use in 


peripheral settings with poor access 
(‘point-of-care test’) 


S61 


DIAGNOSTICS FOR TUBERCULOSIS | ARINAMINPATHY & DOWDY 


Table 2 | Profiles of three illustrative diagnostic tests for tuberculosis (TB). 


Mathematical 
representation 


Descriptive profile 


Illustrative name 
(see Table 1) 


Approximate number 
needed to screen to identify 
1 additional case, typical 
high-burden setting 


Avertable transmission load 


This test could be applied to a general 
population to identify people who 
would subsequently develop active 
TB; these people could be treated with 
highly effective preventive regimens 


Progression biomarker 


This test could be applied to all people 
presenting to care with a cough, even 
if suspicion for TB was low — those 
testing positive could have a highly 
sensitive test performed 


increases 


This test would allow highly sensitive 
diagnosis among those already seeking 
care with high suspicion of TB 


Smear replacement test 
increases 


As for the smear replacement test, but | As above 
one that is possible to deploy in the 


poor-access population 


gaps suggests a potential diagnostic solution that would have high incremental 
value. This may be a test to predict progression to active TB (and thus allow tar- 
geted preventive therapy), one optimized for diagnosing combinations of symp- 
toms (such as cough and fever), one that is simply more sensitive, or one that is 
more deployable to peripheral and informal settings (Table 2)°*. We incorporate 
these possibilities more formally into a mathematical model of TB transmission. 


Model description. Figure 2 presents a simple, illustrative model of TB diagno- 
sis and transmission that expands the constant care-seeking approach shown 
in Figure 1a. In this model, the population is divided into different compartments 
that reflect the natural history of TB and incorporate both the stages of the di- 
agnostic pathway shown in Figure 1b and the corresponding diagnostic gaps 
listed in Table 1. Movement of people between these compartments can be rep- 
resented by a system of ordinary differential equations, with rates of transition 
between compartments (for example,/,, the rate of initiating care seeking) that 
reflect the inverse of the mean duration of time spent in each phase (for exam- 
ple, the mean duration between onset of infectiousness and beginning to seek 
care). As most of these durations are currently unknown (and differ from one 
setting to the next), we assume — for the purposes of illustration — a popula- 
tion that is at equilibrium, with values of TB incidence, prevalence and mortality 
that reflect a setting of moderate TB burden (see Supplementary Information). 
We then use this simplified model to estimate, in this hypothetical setting, 
the incremental value of diagnostic tests with different profiles under differ- 
ent assumptions about the relative importance of each diagnostic gap. This 
simplified model divides the population of individuals with active TB into three 
categories (Figs 1b and 2): those who are infectious, but who are not actively 
seeking care (/,), those who have early symptoms that trigger less frequent 
care seeking and who have a lower probability of correct diagnosis/empiric 
therapy (/,), and those who have characteristic and prolonged symptoms that 
trigger frequent care seeking and a likely diagnosis with each attempt (I). We 
also assume a general population and a sub-population (I’, set at 10% for the 
purposes of illustration) with ‘poor’ access to care whose rate of care seeking is 
a specified fraction (k, set initially at 0.5) of the rate in the general population. 

Importantly, this model captures the three dynamic processes of trans- 
mission, health-care seeking and empiric treatment shown in Figure 1b. First, 
the rate of transmission (the probability of a ‘contact’ resulting in TB transmis- 
sion, multiplied by the number of potential contacts per unit time) can vary 
over time. For example, 8, (the number of transmissions per person-month 
spent in the asymptomatic infectious state /,) may be higher than f, and , 
(transmission rate from the symptomatic states |, and I,), because the contact 
rate with susceptible individuals may be highest early in the disease course 
(suggested by the high prevalence of TB infection in contact investigations). 
Alternatively, the inverse might be true because the bacillary burden grows 
over time. We capture this in the concept of the ‘transmission load’, which 
we define as the proportion of transmission events at the population level that 
occur in each of these three stages. Second, the rate of seeking care can in- 
crease over time as symptoms progress. Third, the probability of diagnosis 
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A proportion of individuals with 
latent TB infection are returned to 
the uninfected state if successfully 
identified and treated 


The probability of successful diagnosis 


in the early symptomatic period 


The probability of successful diagnosis 
in the late symptomatic period 


100-500 (1/|lifetime probability of 
incident TB x probability of completing 
effective preventive therapy]) 


20-100 (prevalence of active TB among 
all patients with a cough) 


10-20 (prevalence of smear-negative 
active TB among those with smears 
currently performed) 


10-20 (prevalence of smear-negative 
active TB among those with smears 


Pre-care seeking, mild symptoms 
and prolonged symptoms (general 
population only) 


Mild symptoms and prolonged 
symptoms (general population only) 


Prolonged symptoms (general 


population only) 


Prolonged symptoms (low-access 
population) 


currently performed) 


with each care-seeking attempt can also increase over time, as symptoms be- 
come more suggestive of underlying TB disease°®. These two processes can 
be combined into a single ‘rate of successful diagnosis and treatment’ (d) that 
increases over time from d, to d,to d.. 

We explore three hypothetical settings for how transmission varies during
the course of TB disease: late diagnostic gap, in which the transmission rate β is
four-fold higher at each subsequent stage of TB disease (for example, constant
contact rate with susceptible individuals with increasing bacillary burden); early
diagnostic gap, in which β falls by a factor of four at each stage (for example,
pool of susceptible individuals shrinks over time as household members and
other close contacts are exposed); and high access disparity, in which those
with least access to care are assumed to have a rate of diagnosis and treatment
that is 10% (rather than 50%) that of the general population. Each setting is
calibrated to have the same level of TB incidence (see Supplementary Information).
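
As a rough illustration of how a stage-specific transmission rate and the time spent in each stage combine into a transmission load, the following Python sketch may help. It is not the calibrated model used here: it assumes that every case passes through all three stages, it ignores the poor-access sub-population, and the function name and the 3-month duration of the middle stage are hypothetical. The 6-month and 1-month durations are the values quoted for the asymptomatic and late symptomatic stages in the Results. The output is not expected to reproduce Figure 3.

```python
def transmission_load(betas, durations):
    """Return the share of all transmission that occurs in each stage.

    betas: relative transmissions per person-month in each stage.
    durations: mean months spent in each stage (the inverse of the exit rate).
    Simplifying assumption: every case spends the mean duration in every stage.
    """
    if len(betas) != len(durations):
        raise ValueError("betas and durations must have the same length")
    contributions = [b * t for b, t in zip(betas, durations)]
    total = sum(contributions)
    return [c / total for c in contributions]


# Stage order: pre-care seeking, mild symptoms, prolonged symptoms.
# 6 and 1 months are quoted in the Results; 3 months is a hypothetical placeholder.
DURATIONS = [6.0, 3.0, 1.0]

late_gap = transmission_load([1.0, 4.0, 16.0], DURATIONS)   # beta rises four-fold per stage
early_gap = transmission_load([16.0, 4.0, 1.0], DURATIONS)  # beta falls four-fold per stage

for name, loads in (("late diagnostic gap", late_gap),
                    ("early diagnostic gap", early_gap)):
    print(name, [f"{100 * share:.0f}%" for share in loads])
```

Even in this simplified form, the sketch shows why a stage with a high transmission rate can still carry a modest share of transmission if little time is spent in it.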

In the context of each of these settings, we explore the potential incremental
value of four illustrative diagnostic tests: a ‘progression biomarker’ that predicts
progression from latent to active TB (to facilitate preventive therapy); a ‘triage’
test that facilitates syndromic diagnosis of people presenting with cough; a
more sensitive ‘replacement test’ to supplant current sputum-based confirmatory
tests for TB; and a ‘point-of-care test’ that can replace sputum smear in
peripheral settings, thereby (unlike the other three tests) being accessible to
those with poor access to care. These tests, along with their mathematical
representation in our simplified modelling framework, are summarized in Table 2.

We focus on comparisons between these types of diagnostic tests when
they are added to the standard of care. To illustrate the transmission
contributions of different groups, we assume that progression biomarker, triage and
replacement tests are deployed in the general population, whereas the
point-of-care test is deployed in the poor-access population. We discuss below how
different diagnostic gaps might cause each of these illustrative tests to be
preferred over the others, thereby emphasizing the importance of quantifying (or
at least estimating) the diagnostic gap in any given setting.


[Figure 2: compartmental model diagram with parallel columns for the general population and the population with poor access to care. Compartments shown include U (uninfected) and L (latent infection), with U′ and L′ in the poor-access population, followed by infectious stages labelled by care seeking and TB-specific symptoms, each with its own transmission rate.]

Figure 2 | Model structure relating diagnostic pathways to transmission load.
A representation of a simple mathematical model that incorporates the three
stages of diagnosis shown in Figure 1b. Relative rates of transmission, β, can vary
from one stage to the next, with γ representing the inverse of the mean duration
of each stage at the population level. Upward arrows denote removal of cases
through diagnosis and curative treatment, d, as well as spontaneous resolution
(not shown, for simplicity). We also assume a fixed proportion of the population
(10% in the base case) have ‘poor’ access to care, defining an ‘access disparity
parameter’ k to reflect the relative rates of diagnosis in this population. At
baseline, we assume that k = 0.5. TB, tuberculosis.


[Figure 3: stacked bar chart. y-axis: transmission load (%), 0 to 100. x-axis: late diagnostic gap, early diagnostic gap, high access disparity. Legend: pre-care seeking, mild symptoms and prolonged symptoms, each shown separately for the general and low-access populations.]

Figure 3 | Tuberculosis (TB) transmission load under three alternative scenarios.
The size of each bar denotes the transmission load, defined as the percentage
of all tuberculosis transmission that occurs within a given diagnostic stage.
Transmission from the general population is shown in darker colours, with
that originating from the ‘poor-access’ population shown in lighter colours.
Interrupting transmission at a given stage also averts transmission in subsequent
stages (for example, diagnosing a case in one stage also averts the transmission
that this case could have caused in a later stage); this effect can be calculated as the
sum of the transmission load in the relevant stage plus all subsequent stages
within that population.


Incorporating resource constraints. Ultimately, discussions of a new diag- 
nostic test's incremental value must also consider any constrained resources 
— whether economic or otherwise — that would be required to implement the 
test. One method for evaluating the incremental value of a diagnostic test in a 
given setting is to first identify any constrained resources required for test im- 
plementation. The additional resources required to change from the existing 
standard of care to an algorithm that augments that standard of care with the 
new diagnostic test can then be estimated (the incremental resource requirement).
Finally, this is combined with estimates of the incremental number of
transmissions averted under this augmented algorithm, relative to the stand- 
ard of care (incremental impact). Thus, tests that aim to avert TB transmission 
can be compared using an inverse incremental cost-effectiveness ratio: (in- 
cremental transmissions averted)/(incremental resource requirement), or 


$$\frac{T_1 - T_0}{R_1 - R_0} \qquad (1)$$

where 1 denotes the presence of the new test and 0 denotes its absence.

BOX 1 | ESTIMATING THE INCREMENTAL VALUE OF
TUBERCULOSIS DIAGNOSTIC TESTS, PER UNIT OF 
CONSTRAINED RESOURCES. 


In comparing diagnostic tests for tuberculosis (TB), it is important to 
consider both the incremental impact (presented here as transmis- 
sions averted) and the incremental resource requirements associat- 
ed with implementing each new test. The following considerations 
are not meant to be an exhaustive list, but a demonstration of some 
of the complexity that must be considered (and corresponding data 
collected) to properly evaluate the incremental value of diagnostic 
tests for TB in the setting of constrained resources. Illustrative con- 
siderations therefore include: 


Determinants of incremental impact 

(incremental transmissions averted) 

1. Epidemiological setting/existing diagnostic gaps 

2. Diagnostic test characteristics (accuracy, diagnostic gap 
targeted) 

3. Existing diagnostic algorithms (incremental role of the new test) 


Determinants of incremental resource 
requirements 


1. Enumeration of constrained resources 
2. Number of tests needed to identify one additional case 
3. Per-test outlay of constrained resources (‘unit cost’) 


In settings in which TB diagnostic tests are being compared with other
interventions (for example, TB treatment or HIV diagnosis), transmissions
averted can be converted into measures of health utility (such as
disability-adjusted life years, or DALYs, averted) to estimate resources in terms of
economic costs and to report this incremental value as an incremental
cost-effectiveness ratio. However, when only comparing diagnostic tests with the
same primary aim (to avert transmission), the formulation of incremental val- 
ue in Equation 1 may be more useful; this formulation places the emphasis on 
impact rather than cost and does not require additional model assumptions 
to convert transmissions into DALYs or constrained resources (for example, 
human capacity) into economic costs. Therefore, we use this more direct for- 
mulation in our model results. 
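
A minimal Python sketch of the ratio in Equation 1 is given below, under the reading that T counts transmissions averted and R counts units of the constrained resource, each evaluated with and without the new test. The function name and the numbers in the usage example are hypothetical and serve only to show the arithmetic.

```python
def incremental_value(t_with, t_without, r_with, r_without):
    """Incremental transmissions averted per unit of incremental resource.

    t_with, t_without: transmissions averted with and without the new test.
    r_with, r_without: constrained resources used with and without the new test.
    """
    incremental_resources = r_with - r_without
    if incremental_resources <= 0:
        # The ratio is only meaningful when the new test consumes extra resources.
        raise ValueError("the new test must require additional resources")
    return (t_with - t_without) / incremental_resources


# Hypothetical inputs: 120 versus 100 transmissions averted,
# 1,500 versus 1,000 units of the constrained resource.
value = incremental_value(120.0, 100.0, 1500.0, 1000.0)
print(f"{value:.3f} transmissions averted per unit of resource")
```

Because the ratio is expressed per unit of whichever resource is constrained, the same function applies whether R is measured in money, staff time or the number of tests that can be run.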


RESULTS 

Incremental value of TB diagnostic tests. Figure 3 shows how the transmis- 
sion load at equilibrium (the proportion of population-level transmission con- 
tributed by each stage) differs in each transmission scenario. For example, 
in the late diagnostic gap scenario, 35% of all transmission originates from 
individuals with mild symptoms in the general population, whereas this
percentage falls to 5% in the early diagnostic gap scenario. Importantly, averting
transmission in the earlier stages (for example, preventing a case from
developing, even before care seeking) also averts that transmission in later
stages, as seen in Figure 3 by the combined value of the stacked bars. Thus, for
example, preventing all transmission in the latter two care-seeking stages in 
the general population would avert 51% (35% + 16%) of all transmission in the 
late diagnostic gap scenario, compared with only 5% in the early diagnostic 
gap scenario — and a diagnostic test targeting these stages (for example, the 
‘cough triage’ test) might be expected to have greater impact in settings that 
more closely resemble the late diagnostic gap scenario. 
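
The cumulative calculation described above can be sketched in a few lines of Python. The two loads are the general-population values quoted in the text for the late diagnostic gap scenario; the pre-care-seeking share is not quoted and is therefore left out. The stage labels and function name are hypothetical.

```python
# Stages in the order in which a case passes through them.
STAGES = ["pre_care_seeking", "mild_symptoms", "prolonged_symptoms"]


def avertable_load(loads, first_stage):
    """Sum the transmission load (%) in first_stage and all subsequent stages."""
    start = STAGES.index(first_stage)
    return sum(loads[stage] for stage in STAGES[start:])


# Late diagnostic gap scenario, general population (values quoted in the text).
late_gap_general = {"mild_symptoms": 35.0, "prolonged_symptoms": 16.0}

triage_target = avertable_load(late_gap_general, "mild_symptoms")
replacement_target = avertable_load(late_gap_general, "prolonged_symptoms")
print(triage_target, replacement_target)
```

A test that acts from the mild-symptom stage onward is credited with the load of both care-seeking stages, whereas a test that acts only in the prolonged-symptom stage is credited with that stage alone.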

A notable feature of the late diagnostic gap scenario is that, despite
transmission being substantially more intense in the prolonged-symptom stage
(16 times greater per unit time than in the pre-care-seeking stage), the
contribution of this stage to transmission remains relatively modest. This is
largely due to the relatively short time that individuals spend in this late
symptomatic stage. We assume here that, under the standard of care (typically
using sputum smear microscopy), individuals are diagnosed on average after
1 month in this late symptomatic stage, compared with 6 months spent in the
asymptomatic stage. However, the high access disparity scenario shows the
potential importance of the late symptomatic stage when the rate of diagnosis
is diminished. Here, transmission in the late symptomatic stage is sufficiently
strong for 55% of the transmission load to occur from a high-risk (and
symptomatic) subgroup that accounts for no more than 10% of the total population,
a level of disproportionate transmission that is only modestly higher than
has been suggested in some settings.


Figure 4 | Maximum incremental impact per unit of constrained resources
for four illustrative diagnostic tests in three alternative scenarios. On the
y-axis is the maximum incremental impact (number of tuberculosis (TB)
transmissions averted) for each of four illustrative TB diagnostic tests, divided
by the incremental resources required to implement each test (Equation 1).
All measures are benchmarked to an incremental impact of 1 for the
smear replacement test in the general population. Here, we assume that the
constrained resources are simply proportional to the number of people needed
to test to diagnose one additional case of active TB. The maximum incremental
impact is the number of transmissions that would be averted if diagnosis averts
all transmission associated with a given patient stage in Figure 2. Accordingly,
the results presented here should be interpreted as upper bounds that are
illustrative of the role of diagnostic gaps in each stage. In the cases illustrated
here, the ‘progression biomarker’ (which identifies individuals at risk for
progression to active TB) is clearly favoured in the early diagnostic gap scenario,
whereas the point-of-care test (which replaces the smear test in the poor-access
population, and is deployed only in the poor-access population) is strongly
favoured in the high access disparity scenario.


Incremental value of new diagnostic tests under constrained resources. Fig- 
ure 4 shows results for the incremental value (Equation 1), comparing diagnostic 
tests that target different stages and under different transmission scenarios. For 
the denominator of Equation 1, Figure 4 assumes a simple, illustrative example 
for which the constrained resource is the number of individuals who can be test- 
ed with a novel test, irrespective of the test type or its unit cost (see Supplemen- 
tary Table 2 for further details). This might, for example, reflect a setting in which 
donor funding could be obtained to implement a new test, but the equipment or 
human resources available to conduct those tests were extremely limited. For 
the numerator of Equation 1, Figure 4 assumes the maximum number of trans- 
missions averted if the diagnostic test in question (such as the cough triage test) 
could avert all of the transmission occurring in the stage of disease targeted 
(for example, the stage with mild symptoms, but seeking care). In practice, owing to factors
such as imperfect sensitivity and incomplete population-level implementation, 
an actual test would only avert a portion of that maximum transmission load; 
the actual incremental value of each test would therefore be proportionally low- 
er. Thus, in dividing the maximum incremental impact by the fixed incremental 
resources available, Figure 4 compares the maximum incremental value for each 
idealized test type, leaving it to subsequent work to estimate what proportion of 
that maximum could actually be achieved by a given test in practice. 
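
The benchmarking used in Figure 4 can be sketched as follows, assuming (as in that figure) that the constrained resource is proportional to the number of people who must be tested to find one additional case. The avertable loads of 51% and 16% are the late diagnostic gap values quoted above; the numbers needed to test (50 and 15) are hypothetical picks from within the ranges given in Table 2, and the function name is also hypothetical.

```python
def relative_max_value(load, needed_to_test, benchmark_load, benchmark_needed_to_test):
    """Maximum incremental value of a test, relative to a benchmark test.

    load: maximum avertable transmission load (%) for the test.
    needed_to_test: people tested per additional case identified.
    """
    value = load / needed_to_test
    benchmark = benchmark_load / benchmark_needed_to_test
    return value / benchmark


# Benchmark: smear replacement test in the general population (value = 1).
SMEAR_LOAD, SMEAR_NNT = 16.0, 15.0   # NNT chosen from the 10-20 range (assumption)
TRIAGE_LOAD, TRIAGE_NNT = 51.0, 50.0  # NNT chosen from the 20-100 range (assumption)

triage_vs_smear = relative_max_value(TRIAGE_LOAD, TRIAGE_NNT, SMEAR_LOAD, SMEAR_NNT)
print(f"cough triage relative to smear replacement: {triage_vs_smear:.2f}")
```

With these particular inputs the larger avertable load of the triage test is roughly offset by the larger number of people who must be tested, which is why the ranking in any real setting depends on both quantities.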

Figure 4 illustrates that, where the primary diagnostic gap is early in
the disease course, the maximum incremental value for tests that target
earlier stages is higher than that of the smear-replacement test. By contrast,
when the primary diagnostic gap is late in the disease course, the maximum
incremental value of the later-stage diagnostics is far greater. Notably, where
transmission is concentrated among a population with particularly poor access
to care, the maximum incremental value for a test that can be implemented in
this population can be considerably higher than for any other test (as in the
access disparity scenario).


Figure 5 | Maximum incremental value per unit of constrained resources after
incorporation of a cost function. The same illustrative tests are evaluated in
the same alternative scenarios as in Figure 4, but in this case we apply a cost
function that accounts for the fact that diagnosis earlier in the disease process,
or among low-access populations, is generally more resource-intensive on a
per-test basis (see Supplementary Table 2 for full assumptions). After considering
this cost function, the ‘progression biomarker’ is no longer clearly favoured in
the early diagnostic gap scenario, and the degree to which the point-of-care test
is favoured over the smear replacement test in the high access disparity scenario
is reduced by the same factor (8 in this case) by which the cost per person
screened in the low-access population exceeds that in the general population.
As in Figure 4, all measures are benchmarked to an incremental impact of 1 for
the smear replacement test in the general population.

Figure 5 shows an alternative scenario for the denominator of Equation 1 in
which the limiting resource is financial (for example, a fixed amount of money
available), assuming that the cost per test is higher when applied earlier in the
diagnostic pathway. (For example, it is more costly to screen a patient for TB
in a prevalence survey than it is in a clinic.) The ‘unit cost’ of a test is also
assumed to be higher per person when applied in the poor-access population 
(see Supplementary Table 2), as these individuals are assumed to be harder 
to reach than the general population. In the early diagnostic gap scenario, for 
example, considering this unit cost dramatically lowers the maximum incre- 
mental value of the biomarker test that could be achieved per unit of the con- 
strained resource, relative to the cough triage and smear replacement tests. As 
a result, under this alternative resource constraint, the cough triage test, rather 
than the biomarker, would be preferred. 
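
The effect of a unit cost on the comparison can be sketched in the same way. The factor of 8 for the low-access population is the one stated in the Figure 5 caption. The 55% load is the high access disparity value quoted above, and treating it as the avertable load of the point-of-care test is an assumption; the 10% load for the smear replacement test and the shared number needed to test are hypothetical placeholders.

```python
def value_per_unit_cost(load, needed_to_test, unit_cost):
    """Maximum avertable load per unit of money spent on testing."""
    return load / (needed_to_test * unit_cost)


NNT = 15.0                # hypothetical, same for both tests
POC_LOAD = 55.0           # quoted load; use as avertable load is an assumption
SMEAR_LOAD = 10.0         # hypothetical placeholder
LOW_ACCESS_COST = 8.0     # relative cost per person screened (Figure 5 caption)

# Advantage of the point-of-care test when every test costs the same ...
advantage_equal_cost = (value_per_unit_cost(POC_LOAD, NNT, 1.0)
                        / value_per_unit_cost(SMEAR_LOAD, NNT, 1.0))
# ... and when testing in the low-access population costs 8 times more.
advantage_with_cost = (value_per_unit_cost(POC_LOAD, NNT, LOW_ACCESS_COST)
                       / value_per_unit_cost(SMEAR_LOAD, NNT, 1.0))
print(advantage_equal_cost, advantage_with_cost)
```

Whatever loads are used, the advantage shrinks by exactly the cost multiplier, which matches the reduction described in the Figure 5 caption.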


DISCUSSION 

In evaluating novel diagnostic tests for TB, it is crucial that we move beyond sim- 
ple considerations of elements such as sensitivity, specificity and turnaround 
time — and instead begin to consider the incremental value of diagnostic tests 
that fit certain profiles. We use a simple mathematical model to demonstrate 
key trade-offs in an illustrative setting. This work demonstrates how diagnostic 
tests for TB can be quantitatively assessed in terms of their incremental value 
(incremental impact divided by incremental resource requirement), and moreo- 
ver how this incremental value can vary from one setting to the next. 

The prevailing diagnostic gap in a given setting has a profound effect on 
the potential incremental impact of each diagnostic test. When most trans- 
mission occurs before patients begin to seek care, diagnostic tests that require 
patients to access the health system are unlikely to have substantial
epidemiological impact; thus, in the early diagnostic gap scenario (Fig. 3) the only
diagnostic test capable of averting the bulk of the transmission load is the
progression biomarker. Similarly, when a substantial disparity exists between
the high-risk and general population, diagnostics that cannot be implemented
in the high-risk group are limited in their potential.

Consideration of incremental impact must also include consideration of
incremental resource requirements, however. For example, the resources required
to avert a transmission are generally much greater when diagnostic tests are
performed early in the disease course, or in hard-to-reach populations. As a
result, those diagnostic tests with the largest maximum incremental impact
may also be those that require the most resources. In estimating the
incremental resource requirement of a given test, it is important to consider the resources
for TB control that are constrained in a given setting. In many cases, these
constrained resources will be purely financial, but in others, there may be
limitations on the availability of trained staff or laboratory capacity to perform certain
tests. The per-test incremental outlay of the most constrained resources is
therefore also likely to vary from one setting to the next. Ultimately, the
incremental value of a TB diagnostic test depends not only on sensitivity and specificity,
but also on multiple factors that will vary from one system to the next (Box 1).
For any setting, all six of the elements in Box 1 should be evaluated to help to
identify the type of novel test that is likely to have the greatest incremental value
(avert the most TB transmission events, given the constrained resources). As
assessments of these factors are performed across a variety of settings,
consensus may emerge as to which tests should be prioritized for development.

Unfortunately, we currently lack the empirical data in most settings to 
make such an informed assessment. Specifically, it is likely that different trans- 
mission loads and diagnostic gaps — early, late or among high-risk subpopu- 
lations — predominate in different settings, and that resource constraints vary 
widely from one setting to the next. How can this data gap be closed? 

First, we require better evidence regarding how novel diagnostic tests 
function when implemented under field conditions. Such data would allow 
us to estimate the proportion of any diagnostic gap that a new TB test could 
close, as well as the number of tests required to make one additional
diagnosis. Unfortunately, most diagnostic tests are evaluated primarily in
well-funded trials and demonstration studies, without good evidence of how they
perform in the real world. For example, Xpert MTB/RIF was recommended on
the basis of high-quality data about its accuracy and cost-effectiveness under
controlled conditions and in a large field trial; however, emerging evidence
has suggested that, in many settings, the characteristics of Xpert may be
different when implemented in the field, including its sensitivity, calibration,
positive predictive value (owing to low pre-test probability), and accuracy for
rifampin resistance. To make accurate assessments of the incremental value
of diagnostics, we should collect such data early after launch, and update
expectations and recommendations as those data become available.

Second, we need better data on the performance of existing tests, includ- 
ing clinical judgement. These data would enable us to evaluate the incremen- 
tal number of transmissions that a novel test might be able to avert, relative to 
the existing standard of care. A series of recent high-quality studies suggests
that, when patients present with symptoms that are highly suggestive of TB
in upper-middle income settings (for example, South Africa and Brazil), the
probability of empirical diagnosis is reasonably high, but that a large
number of people may be presenting to care with a cough without TB ever being
considered. Such studies are crucial to understand the likely diagnostic gaps
for TB, but unfortunately, very few such analyses have been performed in
settings with fewer resources (for example, most of sub-Saharan Africa and
Southeast Asia) where empirical diagnosis rates (and the capacity to
implement novel diagnostic tests) may be much lower. Characterizations of relative
TB transmission from high-risk populations (akin to the ‘low-access’
population in Figure 3) compared with the general population are also sparse, and
could potentially be informed by better use of surveillance data.

Third, and perhaps most challengingly, we need to prioritize characteri- 
zations of the transmission load and diagnostic gaps in a variety of settings. 
If we can describe the prevailing transmission loads in any given setting, we 
can then quantify the maximum incremental impact (transmissions averted) 
of any diagnostic test in that setting. Ultimately, for any setting, one should 
be able to delineate what proportion of the transmission load in each of the 
phases of TB (pre-care seeking, mildly symptomatic and prolonged
symptomatic in the general population and in high-risk groups) is being averted using
existing tests, and therefore what proportion might still be amenable to
implementation of a novel diagnostic. Molecular characterization of TB (for
example, through whole-genome sequencing) in entire populations is becoming
available and can be linked to conventional epidemiological investigations (for
example, through contact investigations) using increasingly discriminatory
tools for analysis and data collection. Thus, it may become possible to
triangulate an infectious individual's onset of symptoms, initiation of
care-seeking activities and specific transmission events. Studies that merge data on
transmission, contact patterns, symptom histories, care-seeking patterns and
interactions with the health-care system on a population level should be
prioritized in this regard. In the meantime, simple investigation of surveillance
data can help to identify geographic hotspots of transmission, and operational
analyses of diagnostic test implementation can demonstrate where diagnoses
are probably being missed. Although estimating the duration of an infectious
episode poses significant challenges, household cohorts using currently
available tools could cast some light on the ‘transmission load’ that occurs early in
the clinical course.

Finally, we need better investigations of constrained resources in specif- 
ic settings to enumerate the resources that are genuinely constrained, and to 
quantify those resources per test performed (as the equivalent of a unit cost). 
Although conventional economic evaluations of interventions against diseases 
such as TB implicitly consider money to be the most constrained resource, oth- 
er studies in low-income settings have shown that human resources, laborato- 
ry capacity, regulatory infrastructure or ability to implement new interventions 
may be the key limiting factors. This may be especially true in the modern era
of direct assistance for health, which may supply money but not resources
in the form of trained personnel. An understanding of the most constrained
resources in any given setting must then be merged with data on the number of 
tests required to identify an incremental case, as well as the per-test resource 
outlay, for any given novel diagnostic test. Only if we truly understand the re- 
sources that are most constrained in a given setting, as well as the resource 
outlay for each type of diagnostic test, can we identify the diagnostic tests that 
will optimize epidemiological impact under existing resource constraints. 
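
The comparison described in this paragraph can be made concrete with a small calculation. The Python sketch below is illustrative only: the two tests, their numbers and the available resource are hypothetical, and multiplying the tests needed per incremental case by the per-test outlay is one simple way to express the quantity the text calls for, not a method taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticTest:
    name: str
    tests_per_incremental_case: float  # tests needed to find one extra case
    outlay_per_test: float             # units of the scarcest resource per test


def outlay_per_incremental_case(test: DiagnosticTest) -> float:
    """Scarce-resource units consumed to identify one incremental case."""
    return test.tests_per_incremental_case * test.outlay_per_test


def incremental_cases_within(test: DiagnosticTest, available: float) -> float:
    """Incremental cases identifiable before the scarce resource runs out."""
    return available / outlay_per_incremental_case(test)


# Hypothetical inputs: the scarce resource here is technician-hours.
candidates = [
    DiagnosticTest("test_a", tests_per_incremental_case=50, outlay_per_test=0.5),
    DiagnosticTest("test_b", tests_per_incremental_case=20, outlay_per_test=2.0),
]

# Prefer the test that uses the least of the scarce resource per case found.
preferred = min(candidates, key=outlay_per_incremental_case)
print(preferred.name, incremental_cases_within(preferred, available=1000))
```

In this hypothetical setting the test that needs more tests per case is still preferred, because each test draws far less on the resource that is actually limiting.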

Ultimately, the only way to end TB is to diagnose and treat people with TB 
before transmission occurs — novel diagnostics are an essential component 
of any strategy with this aim. If we are to succeed in that endeavour, we must 
think of, and quantify, those tests not just in terms of sensitivity, specificity and 
turnaround time, but rather in terms of their incremental value across a variety 
of epidemiological settings. We present a framework for estimating this incre- 
mental value that also highlights the need for additional data in order to inform 
more appropriate prioritization of novel TB diagnostic tests, across settings 
that may differ in their existing diagnostic gaps and resource constraints. As 
we continue to develop diagnostic tests with the goal of curbing TB trans- 
mission, we must think beyond accuracy and consider the broader context of 
patient behaviour, health systems and TB natural history.

Sustainable HIV treatment in Africa through
viral-load-informed differentiated care 


Working Group on Modelling of Antiretroviral Therapy Monitoring Strategies in Sub-Saharan Africa* 


There are inefficiencies in current approaches to monitoring patients on antiretroviral therapy in sub-Saharan Africa. Patients 
typically attend clinics every 1 to 3 months for clinical assessment. The clinic costs are comparable with the costs of the drugs 
themselves and CD4 counts are measured every 6 months, but patients are rarely switched to second-line therapies. To ensure 
sustainability of treatment programmes, a transition to more cost-effective delivery of antiretroviral therapy is needed. In con- 
trast to the CD4 count, measurement of the level of HIV RNA in plasma (the viral load) provides a direct measure of the current 
treatment effect. Viral-load-informed differentiated care is a means of tailoring care so that those with suppressed viral load 
visit the clinic less frequently and attention is focussed on those with unsuppressed viral load to promote adherence and timely 
switching to a second-line regimen. The most feasible approach to measuring viral load in many countries is to collect dried 
blood spot samples for testing in regional laboratories; however, there have been concerns over the sensitivity and specificity of 
this approach to define treatment failure and the delay in returning results to the clinic. We use modelling to synthesize evidence 
and evaluate the cost-effectiveness of viral-load-informed differentiated care, accounting for limitations of dried blood sample 
testing. We find that viral-load-informed differentiated care using dried blood sample testing is cost-effective and is a
recommended strategy for patient monitoring, although further empirical evidence as the approach is rolled out would be of value.
We also explore the potential benefits of point-of-care viral load tests that may become available in the future.

Monitoring people on antiretroviral therapy (ART) cost-effectively is crucial for the
sustainability of ART programmes in sub-Saharan Africa. In most countries, patients are
required to attend clinics every 1 to 3 months for clinical assessment, the cost of which
(for personnel, infrastructure and maintenance) is comparable with the cost of the
antiretroviral drugs themselves. In most settings, patients are monitored by a CD4 count
measurement every 6 months with clinical observation at least every 3 months, but they are
rarely switched to second-line regimens. A reduction in visit frequency for patients who do
not require an adherence intervention or a switch to second-line ART would benefit
programmes by reducing costs and benefit patients by saving travel costs and time away from
work, possibly lowering the rate of default from care.

The biomarker that most directly measures the ongoing effect of ART is the HIV RNA level
in plasma (the viral load). If viral load is suppressed, this indicates that the patient is
adhering to the drug regimen and does not carry drug-resistant virus. Data from high-income
countries suggest that after 1-2 years of ART with viral-load suppression the visit
frequency can be reduced. If the viral load is not suppressed, this suggests that there is
a need for improved adherence and/or a switch in regimen. In most countries in sub-Saharan
Africa, measurement of viral load is not widely available. Quantification of HIV RNA
requires sophisticated facilities and skilled staff and the costs have been high, although
costs have substantially decreased in the past 5 years. Modelling studies have indicated
that there is a benefit to viral-load monitoring compared with monitoring strategies based
on the CD4 count or clinical observation, but viral-load monitoring has not been found to
be cost-effective, owing to the cost of viral-load tests and second-line regimens.
Currently, the most feasible approach to begin to measure viral load in many countries is
to collect samples as dried blood spots (DBS). DBS are stable at ambient temperature and
can be prepared from capillary whole blood, eliminating the need for phlebotomy services.
Using existing networks for early infant HIV diagnosis, they can be transported to a
regional or national laboratory with results subsequently returned to the clinic by, for
example, mobile phone text messaging. However, the presence of cells and low sample volume
in DBS specimens means that sensitivity and specificity for detecting whether the level is
above the 1,000 copies per millilitre threshold that is used to define viral suppression
are imperfect, and it is unclear if the approach is adequate. Looking to the future, it is
anticipated that point-of-care (POC) tests (tests that enable a decision to be made about
patient management during the same visit that the sample is taken) may become widely
available, and this may result in greater accuracy than the use of DBS, as well as
facilitating rapid action based on the test result.

In the light of these issues, we consider how HIV treatment programmes in low-income
countries in sub-Saharan Africa should monitor patients on ART in a way that is likely to
lead to the greatest population health gains from the limited resources available. We
update a model previously used to compare monitoring strategies, incorporating new lower
costs and the potential for viral-load-informed ‘differentiated care’ based on reducing
clinic visit costs by reducing visit frequency among virally-suppressed individuals.


METHODS 

The HIV Synthesis transmission model is an individual-based stochastic model of
heterosexual transmission, natural history, clinical disease and treatment of HIV
infection that incorporates use of specific drugs, resistance mutations and adherence.


*A list of working group members and their affiliations appears at the end of the paper. Correspondence should be addressed to: A. P. e-mail: andrew.phillips@ucl.ac.uk.

Figure 1 | Overall programme costs. Costs in US$m per 3 months,
according to monitoring strategy (mean 2015-2034, discounted at 3% per 
annum from 2015). ART, antiretroviral therapy; VL, viral load; WHO, World 
Health Organization. 


ART programme and monitoring strategy modelling 

We based our simulated population around that of Zimbabwe and the underlying model is
described in detail in the Supplementary Information. We assumed that up to 2015 a CD4
count monitoring strategy has been used. Then we considered the introduction of plausible
alternative monitoring strategies and predicted outcomes over 20 years to 2035. The seven
main monitoring strategies compared (Table 1) cluster into three main types: clinical
observation (with or without targeted CD4 count or viral-load testing in those with
clinical disease), regular CD4 count monitoring or regular viral-load monitoring. In the
case of viral-load monitoring, we simulate a strategy consisting of off-site
laboratory-based testing of DBS using the World Health Organization (WHO) recommended
threshold of 1,000 RNA copies (cps) ml⁻¹. Viral load measured as <1,000 cps ml⁻¹ in the
past year is assumed to lead to a reduction in non-ART programme costs owing to fewer
clinic visits by people on first-line ART. Measurement of viral load 1,000 cps ml⁻¹ or more
is assumed to lead to a targeted adherence intervention, which increases adherence in some
people. We refer to this strategy as viral-load-informed differentiated care. Regardless of
the monitoring strategy used, once strategy-specific failure criteria are met we assume a
probability of switching to a second-line regimen of 0.5 per 3 months. In practice, current
switch rates are lower than this, even in settings with viral-load monitoring in place; we
chose this higher probability, however, to be able to discern differences in effects
between strategies. In sensitivity analyses we consider a situation in which switch rates
are zero. Throughout, we assume monitoring is performed only for people on first-line ART.
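
To give a feel for what a switch probability of 0.5 per 3 months implies, the following Python sketch treats switching as an independent chance in each 3-month period once failure criteria are met. This is a simplification made for illustration, not the Synthesis model itself, and the lower value of 0.05 is included only as a contrasting example.

```python
def probability_switched(p_per_period: float, periods: int) -> float:
    """Chance of having switched within `periods` 3-month periods,
    assuming the same independent switch probability in each period."""
    return 1.0 - (1.0 - p_per_period) ** periods


def expected_months_to_switch(p_per_period: float, months_per_period: int = 3) -> float:
    """Mean time to switch under a geometric waiting time (1/p periods)."""
    return months_per_period / p_per_period


for p in (0.5, 0.05):
    print(
        f"p={p}: switched within 1 year = {probability_switched(p, 4):.3f}, "
        f"mean time to switch = {expected_months_to_switch(p):.0f} months"
    )
```

Under these assumptions almost all people meeting failure criteria switch within a year at 0.5 per period, whereas most remain on first-line therapy for years at 0.05, which is why the higher value makes differences between strategies easier to discern.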

We model decreased precision of DBS for measuring viral load by considering the presence
of HIV RNA in cells and the small sample volume, such that the sensitivity and specificity
of the measure for detecting viral load of >1,000 cps ml⁻¹ compared with measurement on a
plasma sample are 86% and 92%, respectively (compared with values ranging from 81% to 85%
sensitivity and 88% to 99% specificity for most assays); we consider other values in
sensitivity analysis. We also assume that there is a 3-month delay in the clinician acting
on the result, even though results are generally returned to the clinic quicker than this.
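
The practical meaning of 86% sensitivity and 92% specificity depends on how many of the people tested truly have a viral load above the threshold. The Python sketch below works through the arithmetic for a hypothetical cohort; the cohort size and the 20% true-failure fraction are assumptions chosen for illustration and are not taken from the model.

```python
def dbs_classification(
    n_tested: float,
    true_failure_fraction: float,
    sensitivity: float = 0.86,   # base-case DBS sensitivity from the text
    specificity: float = 0.92,   # base-case DBS specificity from the text
) -> dict:
    """Expected classification counts for a DBS test against plasma."""
    failing = n_tested * true_failure_fraction
    suppressed = n_tested - failing
    true_pos = failing * sensitivity
    false_neg = failing - true_pos
    true_neg = suppressed * specificity
    false_pos = suppressed - true_neg
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        # Share of above-threshold DBS results that are truly above threshold.
        "ppv": true_pos / (true_pos + false_pos),
    }


# Hypothetical cohort: 1,000 people tested, 20% truly above 1,000 cps/ml.
result = dbs_classification(n_tested=1000, true_failure_fraction=0.20)
print({key: round(value, 3) for key, value in result.items()})
```

Because a confirmatory measurement is required before failure is declared in the viral-load strategy, false positives from a single test do not translate directly into unnecessary switches.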

Sensitivity analyses were performed to consider: possible differences in population
adherence profile, potential increases in sexual behaviour, changes in effectiveness of the
adherence intervention triggered by viral load being >1,000 cps ml⁻¹, a policy of
initiation of ART at diagnosis, that visit frequency might be reduced in those with a CD4
count of >350 per mm³ in the past year, a zero rate of switch to second-line regimens,
differences in the baseline prevalence of HIV, differences in the proportion on ART,
differences in the rate of ART interruption if visit frequency has been reduced owing to
viral load being <1,000 cps ml⁻¹, a higher discount rate of 5% rather than 3%, and a
10-year rather than a 20-year time horizon. In addition, we considered the effects of
whether whole blood or plasma is used, whether the test is done in a central laboratory and
incurs the 3-month delay in action or is done at POC with no delay, the threshold to define
failure (200, 1,000 or 5,000 cps ml⁻¹, which is only assessed in the context of plasma),
and the frequency of measurement (every 6 months, annually or every 2 years).

Figure 2 | Cost-effectiveness. Cost-effectiveness plane showing clinical- and
CD4-based monitoring strategies along with viral-load-informed differentiated
care using dried blood spots. DALYs, disability-adjusted life years; ICER,
incremental cost-effectiveness ratio.

Last, we focussed on the specific comparison between viral load using 
DBS and using a plasma-based POC test to quantify the extent of various po- 
tential advantageous features of a POC test on its cost-effectiveness in rela- 
tion to use of DBS. It is important to note that we are considering potential 
features of a POC test — it is not clear that such features can be delivered, 
so this analysis is directed mainly towards developers and should not be in- 
terpreted as indicating that POC tests will prove to have any of these advan- 
tageous features. This is why we chose to consider a plasma-based POC test, 
although in reality it may be more likely that a whole-blood-based test is used 
for many POC tests, to avoid a plasma separation step. Further details of how 
all these aspects are modelled are provided in the Supplementary Information. 


Economic analysis 

Our objective is to maximize population health — the health benefits as- 
sociated with the alternative monitoring strategies are estimated using the 
metric disability-adjusted life years (DALYs) averted — with the available 
resources. A health sector perspective has therefore been adopted for the 
analysis. Direct and indirect costs incurred by the patients are excluded. Both 
costs and health benefits were discounted to present value using a 3% per 
annum discount rate in our base case. The expected costs and health out- 
comes associated with each monitoring strategy can be compared to indicate 
which is likely to represent the best value from the available resources. The 
cost-effectiveness threshold for a country represents the opportunity costs of 
resources required to fund the intervention, in terms of the health gains that 
those resources could generate if used for alternative purposes in the public 
health-care system. As such, the threshold for a country is not readily apparent,
but US$500 per DALY averted is likely to be at the upper end, based on
the magnitude of benefit if the resources were spent on other programmatic
priorities, such as eliminating coverage gaps for ART where these are large. The
modelling results are intended to inform decisions in sub-Saharan African
countries, such as Zimbabwe, that are classified as low or lower-middle income under
the World Bank country classifications and that have typically struggled to scale up
viral-load monitoring. The analyses may also be informative for higher
income countries in the region (such as South Africa and Botswana) that have 
already scaled up viral-load monitoring, but are seeking more efficient ways 
to deliver ART. 
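
The two calculations that underpin this analysis, discounting to present value and comparing an incremental cost-effectiveness ratio (ICER) with a threshold, can be sketched in a few lines of Python. The sketch is a simplification under stated assumptions: it discounts in whole years from year 0, whereas the model advances in 3-month periods, and the function names and example figures are hypothetical.

```python
from typing import Sequence


def present_value(yearly_amounts: Sequence[float], annual_rate: float = 0.03) -> float:
    """Discount a yearly stream of costs or DALYs to year 0 (year 0 undiscounted)."""
    return sum(
        amount / (1.0 + annual_rate) ** year
        for year, amount in enumerate(yearly_amounts)
    )


def icer(
    cost: float,
    dalys_averted: float,
    comparator_cost: float,
    comparator_dalys_averted: float,
) -> float:
    """Extra cost per extra DALY averted relative to a comparator strategy."""
    extra_dalys = dalys_averted - comparator_dalys_averted
    if extra_dalys <= 0:
        raise ValueError("strategy averts no additional DALYs; ICER is undefined")
    return (cost - comparator_cost) / extra_dalys


def is_cost_effective(icer_value: float, threshold: float = 500.0) -> bool:
    """Compare an ICER with the cost-effectiveness threshold (US$ per DALY averted)."""
    return icer_value <= threshold


# Hypothetical example: 100 units per year for 3 years, discounted at 3%.
print(round(present_value([100.0, 100.0, 100.0]), 2))
# Hypothetical strategies: an extra $1,000 buys 5 extra DALYs averted.
ratio = icer(cost=1500.0, dalys_averted=10.0, comparator_cost=500.0, comparator_dalys_averted=5.0)
print(ratio, is_cost_effective(ratio))
```

A full analysis would compare each strategy with the next less effective strategy on the efficiency frontier rather than with a single fixed comparator.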
Table 1 | The seven main monitoring strategies modelled, listed by the short name given to the strategy.

For each strategy: (a) what the monitoring strategy entails for people on first-line ART; (b) criterion for failure of first-line ART; (c) reduction in clinical visit frequency and hence reduction in non-ART programme cost*.

No monitoring
  (a) NA
  (b) NA
  (c) None

Clinical monitoring
  (a) Check on presence of symptoms every 3 months
  (b) WHO 4 condition diagnosed or two WHO 3 conditions diagnosed in 1 year
  (c) None

Clinical monitoring viral load confirmation
  (a) Check on presence of symptoms every 3 months. Measure viral load if WHO 4 condition diagnosed or two WHO 3 conditions diagnosed in 1 year
  (b) Viral load >1,000 cps ml⁻¹
  (c) None

Clinical monitoring CD4 count confirmation
  (a) Check on presence of symptoms every 3 months. Measure CD4 count if WHO 4 condition diagnosed or two WHO 3 conditions diagnosed in 1 year
  (b) CD4 count <250 mm⁻³
  (c) None

CD4 count monitoring (WHO)
  (a) 6-month CD4 count. If failure criteria seem to be met, re-measure to confirm (confirmatory CD4 count)
  (b) CD4 count less than pre-ART baseline or CD4 count <100 mm⁻³ in confirmatory CD4 count
  (c) None

CD4 count monitoring (<200)
  (a) 12-month CD4 count. If failure criteria seem to be met, re-measure to confirm (confirmatory CD4 count)
  (b) CD4 count <200 mm⁻³ after more than 3 years on ART. CD4 <100 mm⁻³ after more than 1 year on ART in confirmatory CD4 count
  (c) None

Viral-load-informed differentiated care using DBS
  (a) Viral load measure using DBS at 6 months, 12 months and every 12 months thereafter. If viral load is >1,000 cps ml⁻¹ then provide adherence intervention and re-measure viral load 3 months later (confirmatory viral load measure). No CD4 count measurements
  (b) Viral load >1,000 cps ml⁻¹ in confirmatory viral load measure
  (c) Yes, when most recent viral load <1,000 cps ml⁻¹, measured in the past year

*Assuming 3-monthly clinical visits for all strategies except under viral-load-informed differentiated care when the most recent viral load <1,000 cps ml⁻¹, measured in past year. More frequent clinical visits than
once every 3 months are not modelled as the model advances in 3-month periods. ART, antiretroviral therapy; Cps, copies; DBS, dried blood spot; WHO 4, World Health Organization stage 4 condition.


Disability weights to calculate DALYs averted were derived from a recent comprehensive
study. Unit costs (in US$ at 2014 prices) are detailed in the Supplementary Information. In
brief, the cost of a viral-load assay is assumed to be $22. This is a fully-loaded cost,
counting all components such as reagents, costs of equipment, human resources, buildings,
and so on (see Supplementary Information). Because POC viral-load tests are not yet
available it was not possible to calculate their cost, so we assumed a similar cost of $22,
although it is likely that the fully-loaded cost will be higher than this. The cost of
measuring CD4 counts is assumed to be $10 (ref. 44). The current annual cost (including
supply chain) of the first-line regimen of efavirenz, emtricitabine and tenofovir (assumed
to be used as a fixed-dose combination) is assumed to be $144 per person per year and that
of the second-line regimen of zidovudine, emtricitabine and ritonavir-boosted atazanavir to
be $312 per person per year. Annual programme costs for clinic visits (not including drugs,
viral load tests or CD4 count tests) are $80 per year, with an assumed reduction to $40 per
year after measurement of viral suppression, because clinical visits are reduced from every
1 to 3 months to every 6 months (with interim pharmacy-only visits, depending on the amount
of drug that can be dispensed).
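
As a rough check on how these unit costs combine, the Python sketch below adds up annual per-person monitoring and clinic-visit costs (excluding drugs) for three simple cases. The unit costs are those stated in the text; the number of tests per year in each case, and the omission of adherence-intervention costs, are simplifying assumptions made here for illustration.

```python
VIRAL_LOAD_TEST = 22.0         # US$ per viral-load assay (fully loaded)
CD4_TEST = 10.0                # US$ per CD4 count
CLINIC_VISITS_STANDARD = 80.0  # US$ per person per year, visits every 1 to 3 months
CLINIC_VISITS_REDUCED = 40.0   # US$ per person per year after measured suppression


def annual_monitoring_cost(tests_per_year: int, cost_per_test: float, clinic_cost: float) -> float:
    """Annual per-person cost of monitoring tests plus clinic visits (drugs excluded)."""
    return tests_per_year * cost_per_test + clinic_cost


# Assumed test counts: two CD4 counts a year; one viral load a year if suppressed;
# a routine plus a confirmatory viral load if the first result is above threshold.
cd4_six_monthly = annual_monitoring_cost(2, CD4_TEST, CLINIC_VISITS_STANDARD)
vl_suppressed = annual_monitoring_cost(1, VIRAL_LOAD_TEST, CLINIC_VISITS_REDUCED)
vl_unsuppressed = annual_monitoring_cost(2, VIRAL_LOAD_TEST, CLINIC_VISITS_STANDARD)

print(cd4_six_monthly, vl_suppressed, vl_unsuppressed)
```

This simple accounting ignores health effects and second-line drug costs, so it indicates only where the visit-related savings arise, not whether a strategy is cost-effective.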


RESULTS 

The status of the simulated population in 2014 is shown in Table S1 in Modelling Methods
in the Supplementary Information. Mean predicted outcomes over 20 years are shown in
Table 2. The proportion of people who are taking or have taken ART (ART-experienced) and
who have fulfilled the criteria for failure of first-line ART is lowest with no monitoring
and is below 15% for each of the clinical monitoring strategies. It is highest for the CD4
count monitoring (WHO) strategy (41%) because the failure definition is fulfilled if the
CD4 count is below the pre-ART baseline level (which can occur owing to high CD4 count
variability, and particularly if ART has been interrupted). The proportion is intermediate
for the CD4 count monitoring (<200) and viral-load-informed differentiated care using DBS
strategies (26% and 27%, respectively). The proportion of all people on ART who have viral
suppression is highest with the viral-load-informed differentiated care using DBS strategy
(86%) and lowest with no monitoring (76%), with the small range of 10 percentage points
reflecting the generally high levels of adherence (although we consider in sensitivity
analyses a situation in which adherence levels are lower and the proportion with viral
suppression is accordingly lower). The death rate is markedly lower for the CD4 count and
viral-load monitoring strategies than for the other strategies, and this is particularly
evident in those among whom viral-load failure has occurred. Notably, there is also a
benefit of viral-load-informed differentiated care using DBS on HIV incidence over all the
other strategies.

Costs and their components by monitoring strategy are shown in Fig- 
ure 1. Programme costs for clinic visits are lowest with viral-load-informed 
differentiated care using DBS owing to the reduction in clinic visit frequen- 
cy among virally-suppressed people. Figure 2 shows the cost-effectiveness 
plane, showing the total incremental DALYs averted in the population over 
20 years, together with the incremental costs (both discounted), compared 
with no monitoring. Owing to the higher death rate of people on ART and 
higher HIV incidence, the clinical monitoring strategies avert fewer DALYs 
than the viral load and CD4-count-based monitoring strategies. Additional 
costs incurred are highest for CD4-count monitoring, particularly the CD4 
count monitoring (WHO) strategy. Viral-load-informed differentiated care 
using DBS averts a similar number of DALYs as CD4-count monitoring and 
is the most cost-effective strategy owing to the reduction in non-ART pro- 
gramme costs in people with viral suppression, with an incremental cost-ef- 
fectiveness ratio (ICER) of $326 per DALY averted. Figure 3 depicts how 
the cost-effectiveness is affected by the assumed costs of viral-load tests 
and savings in clinic visit costs in people with suppressed viral load. In our 
base case viral-load test cost of $22, viral-load-informed differentiated care 
is cost-effective only if reduced clinic visits provide at least a $30 per person 
per year saving offset. 

The effects of varying model assumptions are shown in Figure 4 and Supplementary Figure 1.
Changes in the sensitivity and specificity of viral-load measurement using whole blood (as
used for DBS) did not markedly influence the ICER, nor did the extent of the assumed effect
of viral-load measurement >1,000 cps ml⁻¹ on adherence. The ICER for viral-load-informed
differentiated care was lower when we assumed lower population adherence and when we
assumed higher population levels of unprotected sex, resulting in higher HIV incidence. In
a scenario with a switch rate of zero, viral-load-informed differentiated care was cost
saving. Confirming the results shown in Figure 3, if no reduction in visit frequency is
assumed with viral-load monitoring (Supplementary Fig. 1u) then it is not cost-effective.
The only other scenarios in which viral-load-informed differentiated care was not
cost-effective were a 10-year time horizon instead of 20 years and a doubling of the rate
of ART interruption in people with a reduced visit frequency owing to viral load being
<1,000 cps ml⁻¹ (Figure 4 and Supplementary Fig. 1q and r).
Table 2 | Outcomes over 20 years (2015-2035) in people with HIV (age 15-65), according to monitoring strategy.

Strategies: (1) No monitoring; (2) Clinical monitoring; (3) Clinical monitoring viral load confirmation; (4) Clinical monitoring CD4 count confirmation; (5) CD4 count monitoring (WHO); (6) CD4 count monitoring (<200); (7) Viral-load-informed differentiated care using DBS.

Percentage of ART-experienced people who have fulfilled criterion for failure of first-line ART
  (1) 7%   (2) 14%   (3) 10%   (4) 13%   (5) 41%   (6) 26%   (7) 27%

(row label not recovered)
  (1) 3%   (2) 13%   (3) 10%   (4) 13%   (5) 38%   (6) 24%   (7) 25%

Percentage of people on ART who have (true) viral load <1,000 cps ml⁻¹ (mean; over 20-year time horizon)
  (1) 76%   (2) 79%   (3) 78%   (4) 79%   (5) 85%   (6) 82%   (7) 86%

(row label not recovered)
  (1)-(3) 3.63, 4.06 (one value not recovered)   (4) 3.67   (5) 3.02   (6) 3.07   (7) 3.18

Death rate (per 100 person years) among people with HIV
  (1) 5.45   (2) 4.91   (3) 5.2   (4) 4.93   (5) 4.36   (6) 4.43   (7) 4.47

(row label not recovered)
  (1)-(3) not recovered   (4) 1.63   (5) 1.56   (6) 1.58   (7) 1.57

Death rate (per 100 person years) among people on ART who have virologically failed first-line ART (regardless of whether monitoring strategy has detected it)
  (1) 9.94   (2) 7.5   (3) 8.66   (4) 7.62   (5) 5.53   (6) 5.79   (7) 5.85

(row label not recovered)
  (1)-(3) 0.81, 0.83 (one value not recovered)   (4) 0.81   (5) 0.76   (6) 0.79   (7) 0.73

For each model run for each strategy, the outcome of interest is output for each 3-month period between 2015-2035. Over 500 model runs are done for each strategy, then means
are taken over 3-month periods and model runs. ART, antiretroviral therapy; Cps, copies; DBS, dried blood spot; WHO, World Health Organization.


In the base case we have considered there to be a switch rate of 0.5 
per 3 months after the strategy-specific failure criteria have been met. In 
practice, in most settings, despite CD4 counts being measured, switching 
rates are much lower than this. We compared use of the CD4 count mon- 
itoring (WHO) strategy with a low switch rate of 0.05 per 3 months (the 
current situation in many countries) with viral-load-informed differentiated 
care with a switch rate of 0.5 per 3 months (Fig. 5). The results suggest that 
introduction of the viral-load-informed differentiated care using DBS accom- 
panied by a high switch rate would lead to a substantial improvement in 
DALYs averted with a potential reduction in cost, compared with the current 
situation. In the simulated model population of Zimbabwe, over 20 years the 
CD4 count monitoring (WHO) strategy averts 540,000 DALYs compared 
with no monitoring at a cost of $500 million, whereas viral-load-informed 
differentiated care using DBS averts 1.12 million DALYs compared with no 
monitoring at a cost of $361 million. 
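
The comparison in this paragraph can be checked with simple arithmetic. The Python sketch below uses only the figures quoted above (both relative to no monitoring); the dictionary layout and the `dominates` helper are illustrative constructs, not part of the published analysis.

```python
def cost_per_daly_averted(cost_usd: float, dalys_averted: float) -> float:
    """Average cost per DALY averted relative to no monitoring."""
    return cost_usd / dalys_averted


def dominates(a: dict, b: dict) -> bool:
    """True if strategy a averts at least as many DALYs as b at no greater cost,
    and is strictly better on at least one of the two."""
    no_worse = a["dalys_averted"] >= b["dalys_averted"] and a["cost_usd"] <= b["cost_usd"]
    strictly_better = a["dalys_averted"] > b["dalys_averted"] or a["cost_usd"] < b["cost_usd"]
    return no_worse and strictly_better


# Figures quoted in the text, each compared with no monitoring over 20 years.
cd4_who_low_switch = {"cost_usd": 500e6, "dalys_averted": 540_000}
vl_differentiated_dbs = {"cost_usd": 361e6, "dalys_averted": 1_120_000}

for name, s in [("CD4 (WHO), low switch rate", cd4_who_low_switch),
                ("Viral-load-informed, DBS", vl_differentiated_dbs)]:
    print(name, round(cost_per_daly_averted(s["cost_usd"], s["dalys_averted"])))

print(dominates(vl_differentiated_dbs, cd4_who_low_switch))
```

These are average ratios against no monitoring and should not be confused with the ICER of $326, which compares viral-load-informed differentiated care with the next less effective strategy on the efficiency frontier in the base case.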

We also consider only the viral-load-informed differentiated care strategy and assess the
effect of variations in several aspects (Fig. 6): whether whole blood or plasma is used,
whether the test is POC (central laboratory testing using whole blood is our DBS scenario
above), the threshold to define failure (200, 1,000 or 5,000 cps ml⁻¹, which is only
assessed in the context of plasma), and the frequency of measurement (every 6 months,
annually or every 2 years). Monitoring every 6 months instead of annually averts more
DALYs, but does not seem to be cost-effective at the $500 threshold (ICER = $1,234). Less
frequent monitoring (such as every 2 years) would be cost-effective if it were to avert a
similar number of DALYs to monitoring every year. However, implementing differentiated care
based on viral-load monitoring as infrequently as every 2 years is currently untested and
the potential health consequences are unknown, so this strategy is excluded from the
comparison (Fig. 6a). Using the 5,000 cps ml⁻¹ threshold also averts DALYs at a similar
ICER to the 1,000 cps ml⁻¹ threshold, but with reduced total benefit. Use of a whole blood
sample (for example, DBS) instead of a plasma sample is not predicted to result in a marked
difference in cost incurred (assuming the same unit cost per test) and a modest (4%)
benefit in DALYs averted. There is a small (6%) predicted benefit of POC testing over
laboratory monitoring in DALYs averted owing to the fact that the 3-month delay is avoided.


DISCUSSION 

Our results suggest that viral-load-informed differentiated ART care, using DBS sampling if
necessary, is likely to be cost-effective in low-income settings in sub-Saharan Africa and
is a sustainable model for providing ART. That said, the level of savings that result from
reduced clinic visits and that can be realized in practice with differentiated care are, so
far, uncertain and require monitoring. The level of savings required depends partially on
the cost of viral-load testing. With the viral-load test cost of $22 as used in our base
case, a saving of at least $30 per person per year in those with viral suppression is
required for viral-load-informed differentiated ART care to be cost-effective. Given that
annual non-ART-programme costs average around $80 per year if patients are being seen every
1 to 3 months, a reduction in visit frequency to once every 6 months, and perhaps for
long-term suppressed patients to every 9 to 12 months, should enable such savings. There is
little evidence that patients seen at sites with higher non-ART-programme costs have better
outcomes. We estimate, based on modelling of Zimbabwe over 20 years, that in contrast to
the current situation in many countries (CD4 count monitoring with low switch rates),
introduction of viral-load-informed differentiated care would more than double the number
of DALYs averted compared with no monitoring (1.12 million compared with 0.54 million) and
deliver these at reduced costs ($360 million compared with $500 million).

A reduction in the frequency of clinic visits could also affect patients’ adherence to ART
and retention in care. There is evidence that some patients default from care because they
are unable to keep up with the intensive clinic visit schedule owing to travel time and
cost, and loss of work time. Notably, retention in care was more than 90% at 4 years among
individuals enrolled in community ART clubs in Mozambique, owing, in part, to
community-based adherence support, decreased travel requirements and patient preference. We
did not include in our model any such adherence or retention benefits associated with
differentiated care. There is also the possibility that patients may feel less connected to
care with a differentiated care model, and this might have adverse consequences for
adherence and retention, although so far there is little evidence to suggest this.

Figure 3 | Viral load cost-effectiveness. Indication of whether viral-load-informed
differentiated care is the most cost-effective monitoring strategy according to the cost of
viral load tests and the reduction in non-antiretroviral therapy (non-ART) programme costs
in people with viral suppression, in the context of a cost-effectiveness threshold of
US$500. Colours indicate which monitoring strategy is economically preferred.

When using the CD4 count to monitor people on ART, the WHO-recommended approach has been to
define failure by a CD4 count of <100 cells mm⁻³ of blood or a decline from pre-ART
baseline. Our modelling suggests that, given the high variability in CD4 count and the fact
that it is not uncommon for people to interrupt ART for periods of time, this latter
component results in low specificity and in many patients with viral suppression being
incorrectly categorized as failing and hence switched unnecessarily. The alternative
approach we evaluated, similar to that used in the DART trial, is to define failure based
on a CD4 count of <100 mm⁻³ in years 1-3 on ART, and a CD4 count <200 mm⁻³ thereafter. This
approach performed well in our modelling in terms of the death rate of people on ART (as it
did in the trial itself), although it still resulted in a lower rate of viral suppression
and hence a higher HIV incidence than with viral-load monitoring, resulting in poorer
overall effectiveness. In settings that continue to have a CD4 count capacity, but not
viral-load capacity, this suggests that the CD4 count monitoring (<200) strategy should be
used until viral-load-informed differentiated care is introduced.

The requirement for frequent clinic visits is partially driven by short- 
ages of ART supplies at a national level, resulting in clinic level rationing of 
ART quantities dispensed to patients at each visit. Increasing country buffer 
stocks, as well as improving forecasting of need, could enable longer drug 
supplies to be prescribed. However, even if it is not possible to prescribe
more than 1-2 months' supply of drugs, various approaches can be considered
to prevent patients from having to make frequent pharmacy-only visits
to the clinic. These include community ART groups, whereby one person
picks up the drugs for all the members, or situations in which patients can
pick up medicines in a shop or other non-clinical setting. Other hurdles to
overcome in adopting viral-load-driven reductions in frequency of clinical 
visits include obtaining buy-in from Ministries of Health for any required 
task shifting, and provision of human resources for dedicated adherence 
support for people with high viral load. In addition, support from professional 
associations of clinical, nursing and pharmacy staff will be important. 

The fact that the viral load is a direct measure of the ongoing effect
of treatment means it provides an ideal means to differentiate care provision.
However, given the wider availability of CD4 count tests, it might be
suggested that the CD4 count could be used instead. For example, visit
frequency for people with a CD4 count of more than 350 cells mm⁻³ could
be reduced. This would result in a similar reduction in clinic visit costs to
viral-load-informed differentiated care. The effectiveness of such an approach
is unknown, however, and it would lead to some people in whom
adherence is low and/or resistance is present, and viral load is high, being
asked to visit clinic less frequently. It is well established that CD4 counts
can remain high when virological failure is occurring and, likewise, that the
CD4 count can remain low despite full virologic suppression. Thus, the negative
effects of such a strategy would be a concern and, although we did
model this as a potential strategy (Supplementary Fig. 1j), it is possible that
we did not fully capture the extent of these negative effects.

Figure 4 | Incremental cost-effectiveness ratio (ICER) for viral-load-informed
differentiated care using dried blood spot (DBS) (compared with next less
effective strategy on the efficiency frontier) according to changes in assumptions
(see Supplementary Figure 1). a, DBS sensitivity 96% and specificity 79% for
1,000 copies per millilitre threshold (versus plasma). b, DBS sensitivity 71% and
specificity 97% for 1,000 cps ml⁻¹ threshold (versus plasma). c, DBS sensitivity
88% and specificity 93% for 1,000 cps ml⁻¹ threshold (versus plasma). d, DBS
sensitivity 85% and specificity 79% for 1,000 cps ml⁻¹ threshold (versus plasma).
e, Poorer population antiretroviral therapy (ART) adherence profile such that
proportion with viral suppression with no monitoring/no second-line ART is
68% compared with 76% in base case and HIV incidence is 0.96 per 100 person
years compared with 0.84 in base case. f, Future greater increase in sexual
behaviour in population such that HIV incidence is 1.46 per 100 person years
compared with 0.84 in base case. g, Permanent increase in adherence as a result
of viral load measurement alert in none rather than 40%. h, Permanent increase
in adherence as a result of viral load measurement alert in 100% rather than
40%. i, Policy of initiation of ART at diagnosis. j, Reduced frequency of visits if
CD4 count measured >350 in past year. k, Switch rate of 0 (so only benefit of
monitoring is to inform who should be seen less frequently). l, Lower prevalence
of HIV in 2014 (6% instead of 15% in base case). m, Higher prevalence of HIV
in 2015 (33% instead of 15% in base case). n, Lower proportion on ART in 2015
(33% instead of 56% in base case). o, Higher proportion on ART in 2015 (70%
instead of 56% in base case). p, 5% discount rate instead of 3%. q, Ten year time
horizon instead of 20 year. r, Two times higher rate of ART interruption if visit
frequency has been reduced due to viral load being <1,000 cps ml⁻¹. s, Two
times lower rate of ART interruption. Based on 200 model runs per strategy for
each of a-s.

We have largely focussed on use of DBS rather than plasma collection 
as an approach. Although plasma samples from a venepuncture and sample 
separation are an ideal sample, for transport of more than 6-24 hours this 
requires cold temperatures and so the approach is only likely to be applica- 
ble in areas for which samples can reach the laboratory in that time. 

Although we have argued that a DBS approach is feasible in most settings,
this is not to say that the approach is working well everywhere. It is
important that there is investment in improvements to existing systems, in- 
cluding diagnostics laboratories and logistics of specimen distribution, and 
we have endeavoured to capture these costs as part of our overall costs of 
delivering viral-load testing using DBS. It is notable that most studies that 
have evaluated viral load using DBS compared with plasma have been per- 
formed in a laboratory setting using venepuncture samples and a capillary 
tube (which measures a precise 100 µl of whole blood) to fill in the DBS card.
Few studies are available to assess the performance of DBS in the real-world
scenario (where it is hot, where sample-transport times are long, where
venepuncture is not an option, and where samples are from a finger prick
rather than a capillary tube), although one such study has reported encouraging
results. Our finding that viral-load-informed differentiated care is
cost-effective was robust to low levels of sensitivity or specificity using DBS 
(Fig. 4, Supplementary Fig. 1). 

We simplified the comparison of types of viral-load test by breaking 
them down according to whether they are done at POC or in a laboratory 
and whether the sample consisted of whole blood or plasma. We recognise 
that this is something of an oversimplification in that, for example, meas- 
urement of viral load by POC testing on whole blood may not always have 
the same sensitivity or specificity as using whole blood in the form of DBS. 
Improved sensitivity and specificity compared with DBS offers a modest, but 
real benefit, as does the ability to measure the viral-load level such that it 
can be acted on in the same day, avoiding a delay until the next visit or the 
need to contact and recall the patient. Even if a POC viral-load test with the 
desirable properties we considered does become available, it is likely that
countries would use a mix of approaches (plasma samples, DBS and POC) 
depending on settings. It should be noted that the cost we assumed for a 
POC assay of $22 was used as a placeholder for the actual cost when this is 
known. It is uncertain whether such tests will be able to be delivered at this 
cost, as a fully-loaded cost, which takes account of staff operator time, and 
our results should be interpreted in the light of this. 

If differentiated care can be successfully implemented using viral-load 
monitoring less frequently than every 12 months (for example, every 24 
months) our modelling suggests that less frequent monitoring would be ex- 
pected to be cost-effective. However, the health risks of differentiated care 
with such infrequent viral-load monitoring are not well understood and may 
not have been fully captured in our model. Further evidence on whether this 
approach is feasible, and the health consequences of its implementation, is 
required. Only in highly resourced health-care systems (with a cost-effec- 
tiveness threshold of more than $1,400 per DALY averted) is more-frequent 
monitoring (for example, every 6 months) expected to be cost-effective. 
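
The comparison against a cost-effectiveness threshold can be illustrated with a short sketch of an efficiency frontier. All strategy names, costs and DALY figures below are hypothetical and chosen only to show the arithmetic; they are not outputs of the model described here. The function names are likewise illustrative. The sketch removes strongly dominated strategies, then extendedly dominated ones, and reports each remaining strategy's ICER against the next less effective strategy on the frontier.

```python
from typing import NamedTuple, Optional


class Strategy(NamedTuple):
    name: str
    cost: float           # total cost in US$ (hypothetical)
    dalys_averted: float  # DALYs averted versus no monitoring (hypothetical)


def icer(less: Strategy, more: Strategy) -> float:
    """Incremental cost per additional DALY averted."""
    return (more.cost - less.cost) / (more.dalys_averted - less.dalys_averted)


def efficiency_frontier(strategies: list[Strategy]) -> list[tuple[Strategy, Optional[float]]]:
    # Order by effectiveness; among ties the more expensive comes first so it is dropped.
    ordered = sorted(strategies, key=lambda s: (s.dalys_averted, -s.cost))
    frontier: list[Strategy] = []
    for s in ordered:
        # Strong dominance: drop anything that costs at least as much but averts no more.
        while frontier and frontier[-1].cost >= s.cost:
            frontier.pop()
        frontier.append(s)
    # Extended dominance: ICERs must increase along the frontier.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            if icer(frontier[i - 1], frontier[i]) > icer(frontier[i], frontier[i + 1]):
                del frontier[i]
                changed = True
                break
    result: list[tuple[Strategy, Optional[float]]] = [(frontier[0], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        result.append((cur, icer(prev, cur)))
    return result


def best_under_threshold(strategies: list[Strategy], threshold: float) -> Strategy:
    """Most effective frontier strategy whose ICER does not exceed the threshold."""
    chosen = None
    for strategy, ratio in efficiency_frontier(strategies):
        if ratio is None or ratio <= threshold:
            chosen = strategy
    return chosen


# Hypothetical inputs, for illustration only.
options = [
    Strategy("no_monitoring", 0, 0),
    Strategy("cd4_monitoring", 100_000, 100),
    Strategy("vl_dbs_12_monthly", 120_000, 300),
    Strategy("vl_6_monthly", 400_000, 400),
    Strategy("dominated_option", 150_000, 250),
]
for strategy, ratio in efficiency_frontier(options):
    print(strategy.name, ratio)
print(best_under_threshold(options, threshold=500).name)
```

With these made-up numbers, the more frequent strategy stays on the frontier but its ICER exceeds a $500 threshold, which mirrors the shape of the argument in the text rather than its actual values.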

We found little evidence to support substantial benefits associated with
increasing or decreasing the cut-off (viral-load counts >1,000 cps ml⁻¹) at
which treatment is considered to have failed. A cut-off of 200 cps ml⁻¹ results
in more DALYs being averted, because people with virological failure are
identified earlier, but relies on a plasma-based test (and phlebotomy to
achieve sufficient sample volume) and does not meet the $500 cost-effectiveness
threshold.

Given the role of viral-load testing for enabling reduced visit frequency, it
should also have a role in people on second-line regimens. When evaluating
our monitoring strategies we assumed that CD4 count or viral-load tests
would only be done in patients on first-line treatment, so we may have
understated the benefits of viral-load-informed differentiated care.

Figure 5 | Comparison with the current situation. The current situation, CD4
count (World Health Organization, WHO) monitoring with a low rate of
switching in those meeting the failure criteria (0.05 per 3 months), and
viral-load-informed differentiated care with switch rate as in our base case (0.5 per 3
months). DBS, dried blood spot.

We considered whether our base case results would still hold with var- 
ious alterations in assumptions and settings. In a scenario in which the pat- 
tern of adherence was generally poorer than in our base case (leading to 
68% of people on ART with viral suppression compared with 82% in our 
base case) viral-load-informed differentiated care remained cost-effective. 
This was also true in a scenario with a high HIV incidence rate, and in scenarios
with different HIV prevalence and ART coverage, suggesting that our find- 
ings should hold in various settings in the region. 

Randomized trials have been performed to compare outcomes from 
CD4 count and viral-load monitoring, and these have not identified signif- 
icant differences in outcomes. Such trials have been characterized by rela- 
tively short follow-up and low implementation of switching to second-line 
therapy, leading to low power to detect differences.

We focussed on monitoring for adults. In children and, more likely, ad- 
olescents levels of adherence may be lower than in adults. We did find that 
our main findings hold in populations with a tendency for lower adherence. 
However, there may be a greater reluctance to reduce visit frequency as chil- 
dren are growing up and constantly facing new challenges and situations and 
clinic staff may wish for regular contact to ensure that these new challenges 
have not led to a drop in adherence. Likewise, there may be a reluctance to 
reduce visit frequency for women in the year or so post-partum. We also 
considered whether monitoring more-intensively — every 6 months rather 
than every 12 months — would be cost-effective for populations with a poor- 
er adherence profile (Supplementary Fig. 1t), but this was not the case. Other 
limitations of this work include the fact that we considered a hypothetical 
cohort with simulated outcomes, and future trends are uncertain — particu- 
larly in sexual behaviour, levels of male circumcision and adherence to ART. 
Furthermore, we assume continuation of HIV testing and ART availability at 
current trends. The profile of new POC viral-load tests is as yet uncertain, as 
is their cost. However, new diagnostic technologies, including POC viral-load 
testing and beyond, have great potential to enhance delivery of HIV care. We 
have investigated uncertainty through a series of one-way and multi-way 
sensitivity analyses and recognize that there are other approaches, such 
as probabilistic sensitivity analysis and approximate Bayesian computation,
that we intend to pursue in further work.

This work provides insight into how to deliver ART monitoring so that
it is both effective and cost-effective. As well as providing some specific
guidance to programmes, it highlights the need to research this area further
to enable us to continue to understand the attributes of programmes and
to determine how maximum health gains can be realized for patients with
the resources available. We find that evidence is sufficient to recommend
viral-load-informed differentiated care that uses DBS, but that further
empirical confirmation as the approach is rolled out would be valuable.

Figure 6 | Cost-effectiveness planes showing the effect of viral load measurement
frequency, format and threshold, all in the context of viral-load-informed
differentiated care. a, Viral load monitoring every 12 months is compared with
every 6 months (every 2-year monitoring is excluded from the cost-effectiveness
frontier due to unproven ability to base differentiated care on a 2-yearly
value; however, if less frequent monitoring could be implemented without
adverse health outcomes this would be cost-effective). b, Laboratory whole
blood corresponds to dried blood spot (DBS). c, Alternative thresholds to define
failure (viral load >200, >1,000 and >5,000 cps ml⁻¹) are compared in the
context of laboratory monitoring every 12 months using plasma.

Systematic review and meta-analysis of
community and facility-based HIV testing to address
linkage to care gaps in sub-Saharan Africa


Monisha Sharma, Roger Ying, Gillian Tarr & Ruanne Barnabas


HIV testing and counselling is the first crucial step for linkage to HIV treatment and prevention. However, despite high HIV burden
in sub-Saharan Africa, testing coverage is low, particularly among young adults and men. Community-based HIV testing and
counselling (testing outside of health facilities) has the potential to reduce coverage gaps, but the relative impact of different
modalities is not well assessed. We conducted a systematic review of HIV testing modalities, characterizing community (home,
mobile, index, key populations, campaign, workplace and self-testing) and facility approaches by population reached, HIV positivity,
CD4 count at diagnosis and linkage. Of 2,520 abstracts screened, 126 met eligibility criteria. Community HIV testing and
counselling had high coverage and uptake and identified HIV-positive people at higher CD4 counts than facility testing. Mobile
HIV testing reached the highest proportion of men of all modalities examined (50%, 95% confidence interval (CI) = 47-54%)
and home with self-testing reached the highest proportion of young adults (66%, 95% CI = 65-67%). Few studies evaluated HIV
testing for key populations (commercial sex workers and men who have sex with men), but these interventions yielded high HIV
positivity (38%, 95% CI = 19-62%) combined with the highest proportion of first-time testers (78%, 95% CI = 63-88%), indicating
service gaps. Community testing with facilitated linkage (for example, counsellor follow-up to support linkage) achieved high
linkage to care (95%, 95% CI = 87-98%) and antiretroviral initiation (75%, 95% CI = 68-82%). Expanding home and mobile testing,
self-testing and outreach to key populations with facilitated linkage can increase the proportion of men, young adults and
high-risk individuals linked to HIV treatment and prevention, and decrease HIV burden.
Globally, there are around 2.3 million new HIV infections annually,
80% of which occur in sub-Saharan Africa. Despite the high burden,
only one-third of adults in sub-Saharan Africa have been tested for
HIV in the past year and less than 50% of HIV-positive individuals know their
status. Knowledge of one's serostatus is vital for accessing lifesaving antiretroviral
therapy (ART) and linking to HIV prevention. Conventional facility-based
HIV testing and counselling (HTC) has not achieved high testing coverage in
sub-Saharan Africa and will probably be insufficient to meet the ambitious UNAIDS
90-90-90 targets: 90% of HIV-positive people knowing their status,
90% of HIV-positive people who are aware of their status on ART, and 90% of
people on ART virally suppressed. Barriers to facility testing include distance
from clinic, long wait times, costs (transportation, lost wages and childcare),
confidentiality concerns, low perceived risk and infrequent contact with the
health-care system. In addition, patients often present at facilities late in the
course of their illness, increasing HIV morbidity, mortality and transmission.
Community-based HTC (conducted outside of a health facility) has the potential
to overcome these barriers, achieve high coverage, and identify asymptomatic
HIV-positive individuals at high CD4 counts. In addition, community
HTC may reach more men, young adults, and key populations than facility HTC.
Community-based strategies also require minimal infrastructure, allowing for
easier scale up.


Community HTC modalities include: home, mobile, workplace, index
partner/family members (sexual partners or family members of HIV-positive
individuals) and as part of a campaign. Uptake and demographics of populations
reached can vary widely by modality. A large number of studies on HTC have
been conducted in sub-Saharan Africa and a previous systematic review was
completed in 2012, but facility testing was not included and uptake in men and
young adults was not assessed. In addition, several large-scale interventions
have been published since 2012 (refs 11, 13-15). Recently, the World Health
Organization released guidelines that strongly recommend implementing
community HTC. As most countries have multiple and varying epidemics, UNAIDS
recommends creating regional policies tailored to the macroepidemic rather
than nationwide approaches. Local policymakers will need to determine the
optimal combination of community HTC interventions to increase testing in the
context of their country's HIV epidemic.

To provide evidence for decision makers, we summarize the literature on
community and facility-based HTC. We characterize each modality by population
coverage, since high coverage is beneficial to both HIV-positive and -negative
people. HTC can reduce risk behaviour in HIV-negative individuals, while
providing a means to link them to primary prevention (including circumcision
and pre-exposure prophylaxis (PrEP)). We evaluate effectiveness in reaching
men and young adults (both groups have low HIV testing and poorer clinical
outcomes once infected) and targeted HTC for key populations (men who
have sex with men (MSM), commercial sex workers (CSWs) and people who
inject drugs (PWID)), groups that generally have very high HIV prevalence
and low access to health care. We assess HIV positivity to characterize yield
and examine CD4 count at diagnosis to identify modalities that have the potential
to link infected individuals to care earlier in their disease course.
Estimates from our analysis can also be used as parameters in mathematical
models to project the long-term impact of HTC interventions.

Figure 1 | Pooled coverage and uptake of HIV testing and counselling (HTC) modalities. Coverage is defined as total number of people tested/total number of people
in the target population. Uptake is defined as total number of people tested/total number of people offered testing. Bars indicate 95% confidence intervals of
random effects meta-analyses. N, sample size; PITC, provider-initiated testing and counselling; VCT, voluntary counselling and testing.


METHODS 


Inclusion criteria. We conducted a systematic literature review following 
Cochrane and PRISMA (preferred reporting items for systematic reviews and 
meta-analyses) guidelines. Studies were eligible for inclusion if they reported
data on at least one of the following outcomes: coverage (individuals who
accepted HTC/eligible target population); uptake (individuals who accepted
HTC/individuals offered HTC); proportion of young adults (either under 25 or
under 30 years); proportion of men; proportion of first-time testers; HIV positivity
(number positive/total tested); proportion with a CD4 count of 350
cells µl⁻¹ or less; proportion linked to care (those who had visited a clinic, obtained
a CD4 count or initiated ART); proportion retained in care (individuals
retained/individuals who initiated ART); or cost per person tested. The target 
population was defined as the eligible population in the catchment area, ei- 
ther enumerated by the study (often the case for home HTC) or estimated 
(often the case for mobile and campaign HTC). For facility HTC, the target 
population was defined as people visiting the clinic, and for index partner or 
family members it was defined as all sexual partners or cohabitating family 
members listed by the index patient. With the exception of HTC targeted to 
key populations, we excluded HTC studies not related to general population 
screening, including case reports and studies limited to antenatal or paediat- 
ric settings, or to patients with specific diseases (for example, tuberculosis). 
Observational (cross-sectional and cohort) studies and randomized trials 
were eligible for inclusion. Studies were included in the analyses more than 
once if they had different arms or multiple study sites (for example, urban 
and rural settings or different countries). If more than one wave of a survey or 
intervention was completed, only the most recent was used. 
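
The outcome definitions above are simple ratios of counts. The following sketch shows how coverage, uptake and HIV positivity relate to one another for a single study arm. The class name, field names and all counts are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StudyArm:
    """Counts for one study arm (hypothetical field names)."""
    target_population: int  # eligible people in the catchment area
    offered: int            # people offered HTC
    tested: int             # people who accepted HTC
    positive: int           # people testing HIV positive

    def coverage(self) -> float:
        # Individuals who accepted HTC / eligible target population.
        return self.tested / self.target_population

    def uptake(self) -> float:
        # Individuals who accepted HTC / individuals offered HTC.
        return self.tested / self.offered

    def positivity(self) -> float:
        # Number positive / total tested.
        return self.positive / self.tested


# Made-up counts, not taken from any included study.
arm = StudyArm(target_population=1000, offered=800, tested=600, positive=60)
print(arm.coverage(), arm.uptake(), arm.positivity())
```

Coverage can never exceed uptake when the people offered testing are a subset of the target population, which is why the two are reported separately.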


Search strategy. Literature searches were conducted with the help of a librarian
on 22 July 2014 and updated on 10 June 2015. Briefly, we searched PubMed,
EMBASE, Cochrane Library, Global Health Database, African Index Medicus,
and conference abstracts (CROI, R4P, IAS) using MeSH terms for PubMed and 
comparable terms for other databases. Search terms included “HIV infections/ 
diagnosis” AND “Africa South of the Sahara” AND (“mass screening” OR test 
OR tests OR testing OR screen* OR diagnosis OR “counseling”). Bibliographies 
of relevant papers were screened and authors were contacted for missing out- 
comes. Searches were limited to human studies published between 2000 and 
2015. The full strategy is described in the Supplementary Information. 


Definitions of HTC modalities. Community-based HTC was defined as test- 
ing conducted outside of health facilities. Facility-based HTC was conducted in 
health-care facilities (clinics, hospitals, fixed stand-alone voluntary counselling 
and testing sites). Facility HTC was divided into two categories: voluntary counselling
and testing (VCT), which is patient-initiated testing; and provider-initiated
testing and counselling (PITC), which is routine or opt-out HTC that is initiated
by a provider. Community HTC modalities included home (offering HTC door-to-door
to a catchment area), mobile (setting up a mobile van or container to
provide HTC in a central area of a community), index partner or family member 
(offering HTC to individuals who may have been exposed to HIV by a sexual part- 
ner or who have an HIV-positive household member), campaign (short — gener- 
ally 1 to 2 weeks — intensive community mobilization followed by mobile testing, 
often partnered with other health interventions), key populations (targeted to 
MSM, CSWs and PWID) and workplace (offered at a place of employment). We 
examined a subset of home and workplace HTC that used self-testing. 


Data screening and extraction. M.S., R.Y. and R.V.B. screened abstracts for initial
inclusion. Disagreements were adjudicated by reviewing the full text. M.S., R.V.B.,
R.Y. and G.T. reviewed papers for eligibility and used a standardized extraction form
to characterize eligible studies (Supplementary Information 2). Study quality was
rated low, moderate or high based on representativeness of underlying population,
follow-up (present or absent), assessment of outcomes, and number of outcomes
presented. Costs were inflated to 2012 US dollars by converting to local currency
units, multiplying by the ratio of each country's gross domestic product deflator
(2012 deflator divided by base year deflator) and converting back to US dollars.
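
The cost adjustment described above can be sketched as a three-step calculation. The function name, parameter names and all numbers are hypothetical. The text does not say which exchange rate is used for the final conversion, so using the 2012 rate is an assumption of this sketch.

```python
def inflate_to_2012_usd(
    cost_usd_base_year: float,
    fx_base_year: float,    # local currency units per US$ in the base year
    deflator_base_year: float,
    deflator_2012: float,
    fx_2012: float,         # local currency units per US$ in 2012 (assumed)
) -> float:
    """Inflate a reported US$ cost to 2012 US$ via the local GDP deflator."""
    # Step 1: convert the reported cost to local currency units.
    cost_local = cost_usd_base_year * fx_base_year
    # Step 2: apply the deflator ratio (2012 deflator / base-year deflator).
    cost_local_2012 = cost_local * (deflator_2012 / deflator_base_year)
    # Step 3: convert back to US dollars.
    return cost_local_2012 / fx_2012


# Made-up inputs: US$10 per person tested, reported in some earlier base year.
print(inflate_to_2012_usd(10.0, fx_base_year=8.0, deflator_base_year=100.0,
                          deflator_2012=150.0, fx_2012=10.0))
```

Working in local currency means the adjustment reflects price changes in the country where the testing took place rather than US inflation.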


Statistical analysis. Random effects meta-analysis of single proportions with
binomial exact confidence intervals (CI) was used to summarize results. Proportions
were stabilized using the Freeman-Tukey double arcsine transformation
unless the number of events was less than ten, in which case a logit
transformation was used because of convergence issues. Heterogeneity was
quantified using the I² statistic. For modalities with enough data (ten studies
or more), trends were examined by year before 2005 (when the HIV rapid
diagnostic test was introduced), country and facilitated linkage. Analyses were
conducted in R software using the metaprop function in the meta package.

Figure 2 | Pooled percentage of men, young adults and first-time testers by HIV testing and counselling (HTC) modality. Bars indicate 95% confidence intervals of
random effects meta-analyses. N, sample size. PITC, provider-initiated testing and counselling; VCT, voluntary counselling and testing.
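
The pooling approach named under Statistical analysis can be illustrated with a minimal Python sketch. It is not the metaprop implementation used for the analyses. It assumes a DerSimonian-Laird estimate of between-study variance, uses the simple sin² back-transformation, and does not include the logit fallback for studies with fewer than ten events or the exact binomial intervals for individual studies. Function names and the example counts are hypothetical.

```python
import math


def freeman_tukey(events: int, n: int) -> tuple[float, float]:
    """Freeman-Tukey double arcsine transform of events/n and its approximate variance."""
    t = 0.5 * (math.asin(math.sqrt(events / (n + 1)))
               + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (4 * n + 2)


def pool_proportions(studies: list[tuple[int, int]]) -> dict[str, float]:
    """Random-effects pooling of (events, n) pairs (DerSimonian-Laird, assumed here)."""
    transformed = [freeman_tukey(x, n) for x, n in studies]
    t = [ti for ti, _ in transformed]
    v = [vi for _, vi in transformed]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * ti for wi, ti in zip(w, t)) / sum(w)
    # Cochran's Q and the I^2 heterogeneity statistic (as a percentage).
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, t))
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add the between-study variance to each study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled_t = sum(wi * ti for wi, ti in zip(w_re, t)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def back(x: float) -> float:
        # Clamp to the valid range, then back-transform to a proportion.
        return math.sin(min(max(x, 0.0), math.pi / 2)) ** 2

    return {
        "pooled": back(pooled_t),
        "lower": back(pooled_t - 1.96 * se),
        "upper": back(pooled_t + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }


# Made-up (events, n) pairs, for illustration only.
print(pool_proportions([(50, 100), (50, 100), (50, 100)]))
print(pool_proportions([(10, 100), (90, 100)]))
```

The second call shows how widely differing study proportions drive I² towards 100%, the pattern reported for most pooled estimates in this review.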


RESULTS 

We identified 126 eligible studies out of 2,520 abstracts (Supplementary Figure 
SO.a). Overall, 64% of studies were rated moderate or high quality (Supple- 
mentary Information 2). Most studies included in our analysis evaluated facility 
and home HTC. We identified far fewer studies on other types of community 
HTC: home with self-testing (n = 2), workplace with self-testing (n = 2), index 
partner/family member (n = 5), key populations (n = 5), campaign (n = 5) and 
workplace (n = 4). Forest plots of each outcome by modality are provided in the 
Supplementary Information with pooled estimates presented here. I² values of
pooled estimates varied from 90% to 100%, reflecting high heterogeneity in 
study designs and countries included (Supplementary Information). The coun- 
tries represented varied by outcome with the greatest number of countries 
having data for home and facility HTC coverage, uptake and tester demograph- 
ics. Far fewer studies reported CD4 count at diagnosis and linkage to care out- 
comes; studies containing these data were mainly conducted in South Africa, 
Kenya and Uganda. All home self-testing studies were conducted in Malawi 
and most key population studies were conducted in Nigeria. Overall, the
largest number of studies were conducted in South Africa. 


Coverage and uptake 

Coverage was reported in 19 home HTC studies, 1 mobile, 2 campaign,
3 index partner/family member, 5 facility VCT and 5 facility
PITC studies. Overall, community HTC modalities achieved higher coverage
than facility, with home (70%, 95% CI = 58-79%) and campaign (76%,
95% CI = 49-95%) having the highest population coverage (Fig. 1). Home
HTC consistently achieved high coverage across 19 studies, whereas campaign
coverage was also high, but based on only two studies. Pooled coverage
was 37% (95% CI = 33-42%) for mobile HTC, from 1 study conducted in
3 countries (South Africa, Tanzania and Zimbabwe). Coverage of index HTC
was heterogeneous depending on target group (family members or sexual
partners) and type of contact tracing (active or passive referral). Figure
1 shows results for sexual partner tracing only (41%); full results are shown
in Supplementary Figure S18. Facility VCT (15%, 95% CI = 9-21%) and PITC
(18%, 95% CI = 18-31%) attained the lowest coverage.

Uptake was reported in 31 home HTC studies, 2 home
with self-testing, 2 mobile, 3 index partner or family member, 4 campaign,
3 workplace, 3 facility VCT and 11 facility PITC studies.
Overall, community modalities had high uptake (Fig. 1). Home
HTC had a pooled uptake of 82% (95% CI = 76-87%) and home with self-testing
had slightly lower uptake (69%, 95% CI = 59-78%). Mobile and campaign had
the highest uptake (both 97%). Index uptake was 89% (95% CI = 88-90%) for
home testing of family members (Supplementary Figure S10) and 52% for sexual
partners (95% CI = 30-71%; Fig. 1). Uptake for facility VCT was defined as
number tested divided by number referred for VCT by provider; for facility PITC
it was defined as number tested divided by number offered PITC. We found
higher uptake for people given routine PITC (73%, 95% CI = 55-87%) compared
with those referred to on-site VCT (26%, 95% CI = 15-39%).


Demographics of testers 
The percentage of men out of total persons tested was reported in 25 home
HTC studies, 2 home with self-testing, 10 mobile,
3 index partner, 3 campaign, 2 workplace, 20 facility
VCT and 13 facility PITC studies (Fig. 2).
Mobile had the highest percentage of men (50%, 95% CI = 47-54%), whereas
home had the lowest for general population HTC (40%, 95% CI = 39-41%).
Index partner testing had 41% men (95% CI = 23-61%), but varied greatly by
tracing strategy; active tracing had 50% men whereas passive clinic referral had
only 15% (Supplementary Figure S18). Facility VCT and PITC both had 42% men.
Percentage of participants reporting testing for the first time was included
in 20 home HTC studies, 11 mobile, 3
campaign, 3 key populations, 7 facility VCT and 5 facility
PITC studies. Pooled percentages of first-time testers were higher for community
than facility modalities (Fig. 2). Percentages varied by country, with South
Africa consistently having the lowest percentage of first-time testers across
modalities (Supplementary Figures S23-S27). Key population interventions
had the highest proportion of first-time testers (83%, 95% CI = 71-91%), and
mobile had the highest percentage among the general population (63%, 95%
CI = 50-74%). Home HTC had 58% first-time testers (95% CI = 48-67%),
and campaign had 55% (95% CI = 20-91%), but was highly variable depending
on the setting (Supplementary Figure S25). Facility VCT had 53% (95% CI =
40-66%) and PITC had 55% (95% CI = 48-62%) first-time testers.

Figure 3 | Pooled HIV positivity and proportion of newly diagnosed HIV positivity with CD4 count of 350 cells µl⁻¹ or less by HIV testing and counselling (HTC)
modality. Bars indicate 95% confidence intervals of random effects meta-analyses. N, sample size. CSWs, commercial sex workers; MSM, men who have sex with men;
PITC, provider-initiated testing and counselling; PWID, people who inject drugs; VCT, voluntary counselling and testing.

The percentage of young adult testers (either under 25 or under 30 years)
was reported in 17 home HTC studies, 1 home
with self-testing, 13 mobile, index partner, 2
campaign, 20 facility VCT and 6 facility
PITC studies. Results varied considerably by study (Supplementary Figures
S29-S35). Community HTC generally tested a higher proportion of young
adults than facility modalities; home with self-testing had the largest percentage
(66%, 95% CI = 65-67%), followed by mobile, and then home (Fig. 2).
Campaign reported 31% of young adults, but varied from 20-50% depending
on the study (Supplementary Figure S32). Facility VCT had 46% (95% CI =
39-53%) and PITC had 38% (95% CI = 39-53%).


HIV positivity and CD4 count ≤350 cells µl⁻¹
Yield of HIV-positive people (HIV positivity) was reported in 29 home studies,
1 home with self-testing, 12 mobile,
5 campaign, 3 workplace, 4 key
population, 4 index partner, 27 facility VCT
and 17 facility PITC studies.
Community-based strategies for the general population had lower HIV positivity
(6-11%) than facility HTC (18-20%), whereas targeted community HTC for key
populations and sexual partners of index patients had the highest HIV yield (Fig.
3). HTC interventions targeting sexual partners of index cases had 55% positivity
(95% CI = 49-61%), those for MSM had 24% (95% CI = 14-39%), for CSWs
had 27% (95% CI = 12-51%), and interventions targeting PWIDs had the lowest
positivity of 3% (95% CI = 1-15%). Index HTC for family members had similar
HIV yield to home and mobile HTC (9%, 95% CI = 5-14%) (Supplementary
Figure S42). Forest plots of HIV positivity for each modality stratified by country
are shown in Supplementary Figures S36-S44. HIV positivity for community
HTC in the general population largely mirrored prevalence of the country where
the study was conducted, with the exception of four countries with the highest
prevalence: Mozambique, Swaziland, Botswana and Lesotho. These countries
have adult HIV prevalence ranging from 22 to 27% (ref. 128), but HIV yield from
home, mobile and campaign HTC was 5-12%. HIV positivity for facility VCT and
PITC was generally higher than prevalence in the general population.

The proportion of individuals with a CD4 count of 350 cells µl⁻¹ or less 
at HIV diagnosis was reported in 7 home'438424365.7273, 3 mobile?!?4", 3 cam- 
paign#o4776, 8 facility VCT60:81107126,127129-131 and 5 facility PITC studies®!8199/126130. 
Community-based strategies identified HIV-positive individuals at higher 
CD4 counts than facility HTC, with campaign having the lowest proportion 
with a CD4 count of 350 cells µl⁻¹ or less (26%, 95% CI = 22-30%) (Fig. 3). 
Home (39%, 95% CI = 32-46%) and mobile (38%, 95% CI = 36-41%) had 
similar proportions of HIV-positive individuals with a CD4 count of 350 cells 
µl⁻¹ or less, whereas facility VCT (66%, 95% CI = 60-72%) and PITC (71%, 
95% CI = 67-75%) had the highest proportion. 


Linkage and retention in care for HIV-positive people 
Linkage to care was defined as visiting a clinic for community HTC and re- 
turning to the clinic to obtain CD4 count results (or enrolling in pre-ART care) 
for facility HTC. Linkage was reported for ten home!#"9293441-43.65,72132, six mo- 
bile?4719224)133-135, two campaign’©”’, eight facility VCT2681,84,91,92.123,126,136 and five 
facility PITC studies®°8*87""26. Home and campaign interventions achieved a 
high proportion of individuals linked (95%, 95% CI = 87-98%) when paired 
with facilitated linkage to care strategies (for example, lay-counsellor fol- 
low-up to encourage clinic visit); interventions without facilitated linkage 
achieved lower proportions of HIV-positive individuals visiting a clinic (26%, 
95% CI = 18-36%) (Fig. 4). Mobile HTC achieved linkage rates of 37% (95% 
CI = 24-51%); rates were highest in two interventions conducted in South Af- 
rica, one of which used incentivized monetary recruitment and another which 
used a call centre to encourage linkage after HTC**"™. Linkage to care from 
facility VCT was 61% (95% CI = 48-72%) and from PITC was 55% (95% CI 
= 39-71%) (Fig. 4). Time from HTC to linkage to care ascertainment varied by 
study (ranging from 1 to 12 months); the method of ascertainment (participant 
self-report or clinic record) also varied. 
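
The pooled percentages and 95% CIs quoted in this section are described in the figure captions as coming from random effects meta-analyses. The exact estimator is not stated in this section, so the following is only a minimal sketch of one common approach: proportions are pooled on the logit scale with a DerSimonian-Laird estimate of between-study variance. The function names and the study counts in the usage example are hypothetical and are not data from the review.

```python
import math


def inv_logit(x):
    """Map a log odds value back to a proportion between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(studies):
    """Random-effects pooled proportion (DerSimonian-Laird, logit scale).

    studies: list of (events, total) tuples, one per study.
    Returns (pooled, lower95, upper95, tau2).
    Assumption: a 0.5 continuity correction is applied to every study so
    that studies with 0 or `total` events keep a finite logit.
    """
    y, v = [], []
    for events, total in studies:
        e = events + 0.5
        f = total - events + 0.5
        y.append(math.log(e / f))      # logit of the corrected proportion
        v.append(1.0 / e + 1.0 / f)    # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    k = len(studies)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate on the logit scale.
    w_star = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return (inv_logit(mu), inv_logit(mu - 1.96 * se),
            inv_logit(mu + 1.96 * se), tau2)


if __name__ == "__main__":
    # Hypothetical (linked, HIV-positive) counts from four imaginary studies.
    example = [(45, 120), (30, 60), (200, 400), (12, 80)]
    pooled, lower, upper, tau2 = pool_proportions(example)
    print(f"pooled = {pooled:.1%}, 95% CI = {lower:.1%}-{upper:.1%}")
```

Working on the logit scale keeps the back-transformed interval inside 0-100%, which matters for proportions close to either bound, such as the 95% linkage reported with facilitated linkage.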

Four home HTC studies reported ART initiation among those eligi- 
ble!#4143.65. Similar to linkage to care, ART initiation was higher in home inter- 
ventions with facilitated linkage (76%, 95% CI = 68-82%) compared with 
those without facilitated linkage (16%, 95% CI = 12-20%) (Fig. 5). ART in- 
itiation rates after home HTC with facilitated linkage were similar to those 
achieved through facility HTC. Initiation among those eligible was 64% 
(95% CI = 54-72%) in facility VCT and 70% (95% CI = 61-78%) in facility 
PITC, with 3 studies reporting initiation rates for VCT®"2"° and 4 for facility 
PITC608184.871. Self-testing showed an ART initiation rate of 29% (95% CI = 
17-45%), although this number is among all HIV-positive individuals and is 
not restricted to those who are ART eligible because point of care CD4 testing 
was not conducted" (Supplementary Figure S55). 


3 December 2015 | 7580 | 528 


SHARMA ET AL. | HIV TESTING AND LINKAGE 

[Figure 4 plot. Y axis: linked to care/total HIV+ people (0-100%). Community HTC: home & campaign without facilitated linkage (N = 17,467); home & campaign with facilitated linkage (N = 999); mobile without facilitated linkage (N = 2,845). Facility HTC: facility VCT (N = 2,681); facility PITC (N = 8,732).] 

Figure 4 | Linkage to care after community and facility HIV testing and counselling (HTC). Bars indicate 95% confidence intervals of random effects meta-analyses. N, 
sample size. PITC, provider-initiated testing and counselling; VCT, voluntary counselling and testing. 


One study reported retention in care at 12 months after ART initiation for 
home HTC" and two studies of both facility VCT and PITC reported reten- 
tion, one at 6 months® and one at 12 months®°. Not surprisingly, linkage 
rates were higher in the 6-month compared with the 12-month retention study 
(Supplementary Figure S59). Retention was highest for home HTC, although 
the sample size was small (93%, 95% CI = 83-97%) (Fig. 5). Facility VCT 
achieved 53% (95% CI = 32-71%) retention, and PITC achieved 
64% (95% CI = 32-90%). 


Cost per person tested 

The average cost per person tested (2012 US dollars) for community HTC was 
$27.38 for mobile, $16.60 for index, $11.17 for campaign and $8.58 for home 
HTC8893103137-141 (Supplementary Table S2 and Figure S61). The cost per person 
tested was highest for stand-alone VCT ($36.78)®3"*2. Hospital and clinic 
HTC had similar costs ($12.56 and $12.32, respectively)®!8893!40142-"47 (Sup- 
plementary Table S3 and Figure S62). Costs were dependent on the country 
where the study was conducted, the costs that were included (start-up or 
ongoing only) and the intervention scale. 


DISCUSSION 
Across modalities, community HTC successfully reached target groups (men, 
young adults and first-time testers) with higher coverage than facility HTC (Ta- 
ble 1). High uptake of community HTC reflects acceptability of testing outside 
of health-care facilities. Community HTC identified HIV-positive individuals with 
higher CD4 counts who were likely to be earlier in their disease course. Com- 
bined with the potential of community HTC with facilitated linkage to achieve 
high linkage to treatment with similar retention rates as facility HTC, this sug- 
gests that scaling up community interventions could reduce the morbidity, mor- 
tality and transmission associated with late or non-initiation of ART. Although 
community interventions test a large number of HIV-negative individuals, HTC 
can reduce risky sexual behaviour” and provide a means to link uninfected per- 
sons to primary prevention. This is particularly crucial for young women, who 
have high HIV incidence and can benefit from PrEP2'. Preventing HIV infections 
averts future treatment costs as well as morbidity. A recent modelling study 
found that ART scale up should be combined with primary prevention such as 
PrEP to achieve maximum HIV reduction"*®. High coverage of HTC can also re- 
duce stigma around testing. 

Each HTC modality reaches distinct subpopulations and a combination of 
strategies will probably be necessary to achieve high ART coverage. Mobile and 
campaign HTC had high uptake (97%), as individuals who present at a mobile 
van or during a campaign are probably seeking out testing, but home HTC also 
achieved high uptake among people who were offered testing (82%). Home HTC 
also attained high population coverage, probably because offering testing door- 
to-door removes substantial barriers, including eliminating the need to actively 
seek out HIV testing’. However, home HTC is less likely to reach men and young 
adults. A recent home HTC intervention in Botswana reached 85% of women in 
the target population compared with just 50% of men'°. This may be because 
women are more likely to be home at times when the intervention is conducted. 

Campaign HTC has the potential to attain high coverage in large catchment 
areas and identify HIV-positive individuals at high CD4 counts (one-third of newly 
diagnosed HIV-positive individuals had a CD4 count of 350 cells µl⁻¹ or less 
compared with two-thirds or more for facility HTC). The multidisease focus of 
campaigns may reduce stigma of HIV testing interventions. Our results suggest 
that campaign HTC can be a successful strategy for countries seeking to increase 
overall testing coverage in a short time frame. 

Home HTC with self-testing reached the greatest proportion of young adults 
of all modalities examined" and is a promising strategy with high uptake". Young 
adults (age 15 to 24 years) represent 39% of new infections in those over 15 years 
old”, but have lower access to HTC and HIV care and poorer clinical outcomes 
than other age groups“. Home HTC with self-testing had slightly lower coverage 
and reached fewer first-time testers than home HTC administered by counsel- 
lors. The World Health Organization recommends HIV self-testing as an option 
for individuals who are unable or unwilling to receive counsellor-administered 
HTC. However, supervision improves interpretation of results!’ and a reactive 
self-test should not be considered a definitive diagnosis, as standard testing is 
needed to confirm results. More studies evaluating linkage to care following a 
positive self-test are needed"®. 

Mobile HTC is the most effective strategy for reaching men — a target group 
in sub-Saharan Africa. Men are more likely to be lost at each step of the HIV 
treatment cascade; they are less likely to undergo testing, more likely to start 
ART at an advanced disease stage and more likely to interrupt treatment — all of 
which leads to increased morbidity and mortality?2. Qualitative studies highlight 
men’s preference to test outside of facilities®*, so scale up of community inter- 
ventions can meet this need. Future studies could investigate HTC at predomi- 
nantly male workplaces, nightclubs or bars. 

Index testing of sexual partners through active contact tracing is an efficient 
high-yield method that should be scaled up. HIV positivity was 55% in this group 
and the intervention attained a high coverage (41%). The HIV prevalence we 
report is similar to that found in the literature: 45-50% in cohabitating partners 
of HIV-positive adults, most of whom are unaware of their status*®. Interestingly, 
high coverage of males was achieved only through active contact tracing, where- 
as passive tracing identified more women (Supplementary Figure S18). 

Figure 5 | Pooled percentage initiated antiretroviral therapy (ART) among those eligible and retained in care among those who initiated ART. Bars indicate 95% 
confidence intervals of random effects meta-analyses. N, sample size. PITC, provider-initiated testing and counselling; VCT, voluntary counselling and testing. 

Facilitated linkage strategies are a key component of successful communi- 
ty-based HTC. Individuals testing at an HIV facility generally have higher rates 
of linking to care and initiating ART than those who test outside the health-care 
system. However, we found that high linkage rates (comparable with, or higher 
than, facility HTC) can be achieved with community HTC when individuals are 
followed-up to encourage linkage. 

Although scaling up community HTC with facilitated linkage is important, 
the benefits of improving facility HTC coverage should not be overlooked. Con- 
sistent with previous studies, our analysis finds opt-out facility PITC had much 
greater uptake than referring patients to VCT°°. However, coverage of PITC in 
health facilities is low, demonstrating missed opportunities to identify HIV-posi- 
tive individuals and to link them to care. For example, a Ugandan hospital report- 
ed only 50% of inpatients with HIV-related diagnoses were tested for HIV before 
leaving the hospital®*. PITC is an underused strategy in sub-Saharan Africa and 
scaling up testing would provide a safety net for those who do not independently 
seek HTC®", Because PITC identifies mainly symptomatic HIV-positive individ- 
uals with low CD4 counts as well as those with health-care access, it should be 
coupled with other modalities to maximize population coverage. 

Our review identified gaps where additional evidence is needed. A large pro- 
portion of CD4-count and linkage data came from South Africa, with Uganda 
and Kenya also well represented. South Africa has the lowest percentage of first- 
time testers, reflecting the successful scale-up of HTC. There are fewer studies 
from other parts of sub-Saharan Africa, which may limit how much the pooled 
estimates can be generalized. Also, few studies followed patients longitudinally 
and measured linkage to care, ART initiation, retention and viral suppression. In 
addition, although many studies evaluated home HTC, more data are needed for 
other community modalities, including campaign and workplace. 

Data were also limited for key populations. Although key populations have an HIV preva- 
lence up to eight times higher than the general population, interventions for 
them are scarce and scale up is urgently needed">"53. Key population in- 
terventions can reduce the spread of HIV in the general population"*. Currently, 
numerous policy barriers restrict the availability of, and access to, HIV-re- 
lated services for MSM and CSWs, including police harassment and criminal 
laws'*°. Only three HTC interventions were targeted to MSM and only one was 
targeted to CSWs and PWIDs. Most key population HTC studies were from Ni- 
geria; data are needed from other parts of sub-Saharan Africa. We report a high 
HIV positivity combined with a high proportion of first-time testers in MSM and 
CSW groups, highlighting the need for service expansion. We found a lower HIV 
prevalence in PWIDs compared with MSM and CSW groups, reflecting sexual 
transmission as the main mode of HIV spread in sub-Saharan Africa. Successful 
HTC programmes for key populations are community based (particularly mo- 
bile) as many high-risk groups are marginalized and do not have access to con- 
ventional health systems’. Community-based HTC for MSM and PWIDs has 
been shown to have higher acceptance and greater HIV yield than clinic referral 
for HTC". In addition, self-testing is a potential strategy to reach key populations, 
as it demonstrates high acceptability and is considered convenient and private". 

Costs of community-based and facility-based HTC vary by modality, country, 
scale of intervention, linkage strategy and costs included. Generally, communi- 
ty-based HTC and integrated facility HTC costs were comparable. However, stand- 
alone HTC had the highest cost per person tested, indicating that integrated HTC 
may be more cost-efficient than stand-alone services (Supplementary Table S3). 

The limitations of our analysis included the heterogeneity across studies, 
which may not be accurately reflected in the pooled estimates. Differences 
in study design, geographical location (country, urban or rural area) and in- 
tervention year added to the heterogeneity. To address this, we used random 
effects meta-analysis and stratified on key variables (year <2005, country 
and facilitated linkage). In addition, large numbers of HIV-positive individu- 
als were lost to follow-up in studies that reported linkage, so we considered 
these individuals unlinked in our analyses. If individuals linked at another 
clinic, our estimates may be conservative”. Furthermore, assessment of 
linkage to care differed by study (self-report or clinic records review), as did 
time to linkage assessment, which varied from 1 to 12 months after HTC. In 
addition, CD4 count at diagnosis and ART uptake among those with eligible 
CD4 counts could only be assessed in community HTC interventions em- 
ploying point-of-care CD4, as studies that report CD4 only for those visiting 
a clinic would not provide accurate denominators. Only studies reporting 
linkage to care among those eligible for ART were included in our main anal- 
ysis. Also, estimates of coverage vary in their precision because some stud- 
ies conducted population enumeration and others used census estimates of 
the catchment area. Finally, proportion of first-time testers, men and young 
adults tested are crude measures of relative uptake. For example, for home 
HTC, it is not possible to discern whether the 40% of those tested being 
men reflects a lower coverage of men, or a greater coverage of women, or a 
combination of the two. Future studies reporting the number of men, first- 
time testers and young adults offered testing compared with those accept- 
ing testing would increase the accuracy of these measures. Our findings on 
uptake, HIV positivity and CD4 count at diagnosis are similar to a previously 
published meta-analysis’. 

Table 1 | Summary of HIV testing and counselling coverage and tester demographics. 

Parameter 

Self-testing 
(home) 

Coverage (accepted/target population) 70 58-79 37 33-42 76 
Uptake (accepted/offered) 82 76-87 97 94-99 69 59-78 97 
Young adult (age <25 or 30) 49 43-54 51 44-58 66 65-67 31 
Men 40 39-42 50 41-54 44 42-48 41 
First-time testers 58 48-67 63 50-74 55 
CD4 ≤350 cells µl⁻¹ 39 32-46 38 36-41 26 
HIV positivity 10 8-12 11 8-13 8 Sail, 6 

Key populations 

Facility VCT Facility PITC 

49-95 41 26-57 15 9-21 18 8-31 
93-99 50 31-71 26 15-39 73 55-87 
iis 46 39-53 38 24-54 
37-46 41 23-61 41 38-44 42 39-46 
28-81 83 71-91 53 40-66 55 48-62 
22-30 66 60-72 71 67-75 
4-10 55 49-61 16 9-26 18 13-23 20 17-24 

CI, confidence interval; PITC, provider-initiated testing and counselling; VCT, voluntary counselling and testing 


This analysis characterizes linkage and populations reached by HTC mo- 
dalities to inform policymakers who are charged with addressing gaps in test- 
ing. Facility HTC, although important, is unlikely to be sufficient to curb the 
HIV epidemic because many people in sub-Saharan Africa do not have regular 
access to health care. Scaling a combination of community HTC, mobile test- 
ing to reach men, self-testing to reach young adults and outreach to high-risk 
populations, as appropriate to the local epidemic setting, is crucial to achieve 
high knowledge of serostatus and linkage to HIV treatment and prevention in 
sub-Saharan Africa. 

The global burden of malaria has been substantially reduced over the past two decades. Future efforts to reduce malaria further 
will require moving beyond the treatment of clinical infections to targeting malaria transmission more broadly in the community. 
As such, the accurate identification of asymptomatic human infections, which can sustain a large proportion of transmission, is 
becoming a vital component of control and elimination programmes. We determined the relationship across common diagnos- 
tics used to measure malaria prevalence — polymerase chain reaction (PCR), rapid diagnostic test and microscopy — for the 
detection of Plasmodium falciparum infections in endemic populations based on a pooled analysis of cross-sectional data. We 
included data from more than 170,000 individuals comparing the detection by rapid diagnostic test and microscopy, and 30,000 
for detection by rapid diagnostic test and PCR. The analysis showed that, on average, rapid diagnostic tests detected 41% (95% 
confidence interval = 26-66%) of PCR-positive infections. Data for the comparison of rapid diagnostic test to PCR detection at 
high transmission intensity and in adults were sparse. Prevalence measured by rapid diagnostic test and microscopy was compa- 
rable, although rapid diagnostic test detected slightly more infections than microscopy. On average, microscopy captured 87% 
(95% confidence interval = 74-102%) of rapid diagnostic test-positive infections. The extent to which higher rapid diagnostic test 
detection reflects increased sensitivity, lack of specificity or both, is unclear. Once the contribution of asymptomatic individuals 
to the infectious reservoir is better defined, future analyses should ideally establish optimal detection limits of new diagnostics 
for use in control and elimination strategies. 

Over the past two decades, considerable progress has been made in 
reducing the global malaria burden. Between 2000 and 2013 alone, 
malaria-related mortality decreased by 47% worldwide and 54% in 
Africa. In addition, more than half of malaria endemic countries are on track to 
meet global targets to reduce malaria incidence by 75% in 2015 (ref. 1). These 
achievements are largely due to the widespread use of insecticide-treated nets 
(ITNs) and highly effective antimalarial treatments. The treatment of sympto- 
matic cases in particular has been enabled by notable advances in the develop- 
ment and deployment of more accurate malaria diagnostics*?. However, efforts 
to reduce the burden of malaria infections further in the future will require mov- 
ing beyond the treatment of clinical infections to targeting transmission more 
broadly in the community. As such, the accurate identification of asymptomat- 
ic human infections, which can sustain a large proportion of transmission, is 
becoming a vital component of control and elimination programmes?*. 

Community chemotherapy (for example, mass screen and treat (MSAT) 
or mass drug administration (MDA) programmes) in conjunction with on- 
going vector control is an approach under consideration for the interruption 
of transmission. This is achieved through the direct treatment of potentially 
infectious individuals. In the case of MSAT strategies, delivering drugs spe- 
cifically on the basis of positive test results may be considered preferable to 
presumptive treatment because it provides clear benefit to the recipient and 
limits excess drug use that may drive antimalarial resistance. However, ow- 
ing to the insufficient sensitivity of existing field diagnostics used to identify 
asymptomatic infections, studies have shown that MSAT has limited effect in 
reducing transmission®®. 

Measuring parasite infection by microscopy has been the gold standard 
in malaria research for more than a century and remains relatively widespread 
as a point-of-care diagnostic in clinical and epidemiological settings. More 
recently, the advent of rapid diagnostic tests (RDTs), which measure the pres- 
ence of histidine-rich protein 2 (HRP2) for Plasmodium falciparum and/or lac- 
tate dehydrogenase for other Plasmodium species (pLDH), has expanded the 
range of diagnostic options. Originally developed to inform clinical treatment, 
RDTs are increasingly important for epidemiological characterization’ because 
of their low cost and field applicability. However, most only have reported de- 
tection limits in the range of 100 to 200 parasites per microlitre®? in compari- 
son with around 50 parasites per microlitre by expert microscopy”. 

Figure 1 | Schematic of diagnostic detection limits with respect to parasite 
and HRP2 density. The black curve indicates parasite density and the red 
curve indicates HRP2 density. Time scale is in days prior to treatment and in 
weeks after treatment. Horizontal lines are the detection limits of respective 
diagnostics. The blue shaded area shows detectability of parasites by microscopy 
and/or polymerase chain reaction (PCR), whereas the red shaded area shows 
detectability of HRP2 by rapid diagnostic test (RDT). 

Over the past three decades, the development of nucleic acid amplifica- 
tion tests has improved the detection limit for malaria infection to less than 1 
parasite per microlitre by ultrasensitive quantitative polymerase chain reac- 
tion (qPCR). Although these detection thresholds are more appropriate for 
measuring low-density infections than microscopy and RDTs, most PCR tech- 
niques remain impractical for wide-scale use in field surveys owing to cost, 
processing time and the lack of appropriate laboratory facilities in many en- 
demic countries'®. Comparative analysis of malaria prevalence, measured by 
both microscopy and PCR in cross-sectional surveys, has shown that sub-mi- 
croscopic low-density infections are common across a range of transmission 
settings’"*. These infections may be chronic and asymptomatic, particularly 
in previously exposed individuals with more mature immune responses. More 
importantly, even at low parasite densities, they are still capable of infecting 
mosquitoes and seeding onward transmission’. Even though RDTs are be- 
coming more common in areas where these types of infections are prevalent, 
studies formally evaluating their performance in detecting asymptomatic in- 
fections remain scarce. 
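
To make the detection limits quoted above concrete, the sketch below checks which diagnostics would be expected to detect an infection at a given parasite density. It is an illustration only: the thresholds are nominal values taken from the text (the lower end of the 100 to 200 range for RDTs, around 50 for expert microscopy, and 1 as an upper bound for ultrasensitive qPCR), the names `DETECTION_LIMITS` and `detecting_methods` are hypothetical, and the comparison ignores the fact that RDTs measure HRP2 or pLDH rather than parasites directly.

```python
# Nominal detection limits in parasites per microlitre, as quoted in the text.
# Assumption: a test is treated as positive whenever the density is at or
# above its limit; real tests have probabilistic sensitivity near the limit.
DETECTION_LIMITS = {
    "RDT": 100.0,        # reported range is 100 to 200 parasites per microlitre
    "microscopy": 50.0,  # expert microscopy, approximately
    "qPCR": 1.0,         # ultrasensitive qPCR detects below this value
}


def detecting_methods(parasites_per_ul, limits=DETECTION_LIMITS):
    """Return the diagnostics whose nominal limit is met by this density."""
    return [name for name, limit in limits.items() if parasites_per_ul >= limit]


if __name__ == "__main__":
    for density in (0.5, 10, 75, 500):
        print(density, detecting_methods(density))
```

Under these assumptions an infection at 10 parasites per microlitre is sub-microscopic (positive by qPCR only), and one at 75 would be seen by microscopy but missed by an RDT.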

Recently, there has been an increased focus on developing improved di- 
agnostics to inform malaria elimination strategies. The analysis presented in 
this paper aims to determine the concordance of current malaria diagnostic 
methods, forming a baseline to evaluate further how they can be improved 
to inform malaria control and elimination strategies. It should be noted that, 
in principle, quantifying the presence of gametocytes is considered the most 
accurate method for characterizing transmission and the potential infectious- 
ness of individuals. Research in this area is ongoing, but the technical chal- 
lenges of existing gametocyte assays preclude them from standardized use'®. 
Moreover, all malaria infections have the capacity to produce gametocytes'”"®, 
Therefore, in the context of community chemotherapy programmes, any in- 
dividual who tests positive for asexual parasites should be treated to reduce 
transmission. Given this operational framework, this paper does not address 
the role of diagnostics that specifically measure gametocytaemia. 

So far, no studies have comprehensively evaluated the concordance across 
PCR, RDT and microscopy detection methods simultaneously in asymptomatic 
populations. Although microscopy- and PCR-measured prevalences are based 
on similar biological endpoints (parasite density), diagnostic results based on 
RDTs are less comparable given that HRP2 and pLDH are indirect measures 
of parasite biomass”. HRP2 can persist in the blood for up to two weeks after 
parasite clearance*°. Consequently, results across these diagnostic methods 
indicate a range of possible infection states, from patent or sub-microscopic 
infection to recently cleared infection (Fig. 1). A limited number of studies have 
reviewed the detection capability of RDTs in asymptomatic individuals®*!, but 
key research questions still remain. A recent analysis of Demographic and 
Health Surveys (DHS) across Africa showed a higher prevalence of malaria 
when measured by RDTs compared with detection by microscopy in 19 out 
of 22 surveys. This report also highlighted the issue of false positives owing 
to prolonged presence of HRP2 after parasite clearance?’. However, studies 
have not reviewed the detection capability across all three diagnostics. Fur- 
thermore, the DHS study only considered children under 5 years of age and 
did not determine the effect of malaria transmission intensity on diagnostic 
discordance. This is particularly important given that low-density infections 
seem to be most common in adults and in low-transmission settings". 

In this study, we determine the relationship across malaria prevalence 
measures obtained by current diagnostic methods — PCR, RDT and micros- 
copy — for the detection of P. falciparum infections in endemic populations 
based on a pooled analysis of published and unpublished cross-sectional data. 


METHODS 

Literature review and data collection. We carried out two separate literature 
reviews to identify studies in which P. falciparum prevalence was measured by 
different diagnostic techniques in the same individuals: first, by RDT and mi- 
croscopy, and, second, by RDT and PCR. Relevant studies were identified in 
PubMed and Embase, using MeSH and Map terms when possible. For the RDT 
and microscopy review, the search terms were: “‘rapid diagnostic test’ and 
‘microscopy’ [MeSH/Map] and ‘malaria falciparum’ [MeSH/Map]”, and for 
the RDT and PCR review the search terms were: “‘polymerase chain reaction’ 
[MeSH/Map] and ‘malaria falciparum’ [MeSH/Map]”. Searches were limited 
to English, human and post-2005 (considering the substantial development in 
RDTs over time). For Embase, the searches were also limited to journal 
articles. Inclusion criteria were applied as previously described". In short, only 
studies that were cross-sectional (on populations not selected according to 
malaria test results or symptoms), that were of populations from a malaria en- 
demic region, that used RDTs targeting P. falciparum only or mixed infections 
(HRP2 and/or pLDH) and that used PCR or loop-mediated isothermal amplifi- 
cation (LAMP) methods were included. For intervention studies, only baseline 
data were included, except for treatment studies where a sufficient amount of 
time had passed between last treatment and follow-up. Separate publications 
that used the same data set or measured 0% prevalence by both methods were 
removed, as well as data from clusters with fewer than five individuals. RDT and 
microscopy studies identified in our literature search that also included PCR 
measurements were included in the RDT and PCR data set, and vice versa for 
RDT and PCR studies that included microscopy measurements. In addition to 
the literature review, we sought as many individual-level data sets as possible 
from studies with the above inclusion criteria. 


RDT and microscopy. Where available, information on location, sample size, 
RDT brand and type (HRP2 or pLDH), age group (15 or younger compared with 
older than 15) and prevalence estimates were recorded®?***. Furthermore, data 
from the DHS online database were extracted*’. These included individual-level 
data on location and timing of collection, RDT and microscopy test results, RDT 
brand?!, age, sex, use of an ITN, fever and antimalarial use in the past two weeks. 
In addition, individual-level data sets from one unpublished and one published 
study were included“, as well as shared data sets of the RDT and PCR compar- 
ison that also included microscopy measurements (see below)*>”. 


RDT and PCR. Corresponding authors of the 13 studies identified from the lit- 
erature search were contacted to request individual-level data in December 
2014 and reminders were sent out 4 weeks later. Of the contacted authors, 
six responded within the timeframe; five data sets were included*?-4749°°, and 
one data set had been destroyed for privacy compliance. Prevalence meas- 
ures and study information (including PCR method) were extracted as de- 
scribed above from the publications in the aforementioned literature search 
and the non-responders group, as well as included studies from the RDT 
and microscopy search that also reported PCR proportions?9773439404251-55, 
Four additional individual-level unpublished and published data sets were 
included**8. 


Statistical analyses. We analysed the association between PCR- and 
RDT-measured prevalence, and microscopy- and RDT-measured prevalence by 
fitting a linear relationship on the log odds scale". Prevalence (on a scale of 0 
to 1) was defined as 

$$\text{prevalence} = \frac{e^{\text{log odds}}}{1 + e^{\text{log odds}}}, \qquad \text{where} \quad \text{log odds} = \log\left(\frac{\text{prevalence}}{1 - \text{prevalence}}\right)$$

$$\theta^{R}_{i} = \theta^{P}_{i} + \delta_{i} \qquad (1)$$

$$\delta_{i} = \delta'_{i} + \beta\,(\theta^{P}_{i} - \bar{\theta}^{P}) \qquad (2)$$

$$\theta^{R}_{i} = \theta^{M}_{i} + \delta_{i} \qquad (3)$$

$$\delta_{i} = \delta'_{i} + \beta\,(\theta^{M}_{i} - \bar{\theta}^{M}) \qquad (4)$$

In Equations 1-4, $\theta^{R}_{i}$ is the log odds of RDT-measured prevalence in trial i, 
$\theta^{P}_{i}$ is the log odds of PCR-measured prevalence, $\theta^{M}_{i}$ is the log odds of 
microscopy-measured prevalence, $\delta_{i}$ is the log odds ratio (OR) of RDT- to 
PCR-measured prevalence (RDT:PCR; Equation 1) or RDT- to microscopy-measured 
prevalence (RDT:microscopy; Equation 3), $\delta'_{i}$ is the expected log OR of RDT:PCR 
prevalence (Equation 2) or RDT:microscopy prevalence (Equation 4) when the log 
odds of PCR- or microscopy-measured prevalence is equal to the mean across 
trials, $\bar{\theta}^{P}$ and $\bar{\theta}^{M}$ are the mean log odds of PCR- and microscopy-measured 
prevalence, respectively, across trials, and $\beta$ is the regression coefficient. To 
allow for varying sample size and sampling variation across the surveys included 
in our analysis, the model was fitted using Bayesian Markov Chain Monte Carlo 
methods in JAGS version 3.4.0 and the rjags package in R version 3.0.2 (ref. 
13). We also explored fitting polynomial relationships, but these provided no 
substantial improvement in fit to the data over the linear model as assessed by 
deviance information criterion, nor were these fitted relationships qualitatively 
different (data not shown). To confirm that the fitted curves at different prev- 
alence ranges were not overly influenced by the high number of data points in 
lower transmission areas, we fitted separate relationships in three PCR-meas- 
ured prevalence bands: <5%, 5-20% and >20%. These categories represent 
approximate cut-offs that have been suggested as thresholds for operational 
decision-making. Broadly speaking, programmes can begin to consider target- 
ed and focal control strategies when parasite prevalence by microscopy falls 
below 5% (ref. 57), which translates to a PCR-measured prevalence of 20% 
(ref. 14), and move towards targeted elimination when it falls below 1% (ref. 
58) (5% PCR-measured prevalence"). 
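
The log odds model in Equations 1 and 2 can be illustrated with a minimal Python sketch. This is not the authors' JAGS code: the function names are hypothetical, the parameter values passed in are placeholders rather than fitted estimates, and the Bayesian fitting step is omitted entirely.

```python
import math


def logit(prevalence: float) -> float:
    """Log odds of a prevalence strictly between 0 and 1."""
    return math.log(prevalence / (1.0 - prevalence))


def inv_logit(log_odds: float) -> float:
    """Prevalence (0 to 1) corresponding to a log odds value."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def predict_rdt_prevalence(pcr_prevalence, delta_prime, beta, mean_log_odds_pcr):
    """Apply Equations 1 and 2 to a single survey.

    delta_prime, beta and mean_log_odds_pcr are illustrative inputs,
    not estimates reported in the paper.
    """
    theta_pcr = logit(pcr_prevalence)
    # Equation 2: log OR as a linear function of the centred PCR log odds.
    delta = delta_prime + beta * (theta_pcr - mean_log_odds_pcr)
    # Equation 1: RDT log odds = PCR log odds + log OR.
    theta_rdt = theta_pcr + delta
    return inv_logit(theta_rdt)
```

With `delta_prime = 0` and `beta = 0` the function returns the PCR prevalence unchanged, which corresponds to the dashed line of equal detection in the figures.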

We also conducted a meta-analysis of the risk ratio between RDT:PCR 
prevalence or RDT:microscopy prevalence, adjusted for random effects at the 
study level (for RDT:PCR) or country level (for RDT:microscopy). Studies that 
reported zero infections by either diagnostic method were assigned a value of 
0.01 to allow a risk ratio to be calculated. To evaluate the effect of explanatory 
factors on discordant test results, individual-level data were analysed by logis- 
tic regression, allowing for random effects at the study or country level as noted 
above. The meta-analysis was done with the metafor package in R version 3.0.2, 
and the logistic regression with the logit command in STATA version 13. 
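
A per-study risk ratio of the kind pooled in the meta-analysis could be computed as in the sketch below. It rests on several assumptions: the 0.01 replacement is read here as a substitute for a zero count, the standard error is the usual large-sample formula for two independent samples (the paired design is ignored), and the random-effects pooling done in metafor is not reproduced. All function names are hypothetical.

```python
import math


def log_risk_ratio(pos_a, n_a, pos_b, n_b, zero_value=0.01):
    """Log risk ratio of method A to method B and its standard error.

    Zero counts are replaced by zero_value so that the ratio is defined.
    Treating 0.01 as a replacement count is an assumption of this sketch.
    """
    a = pos_a if pos_a > 0 else zero_value
    b = pos_b if pos_b > 0 else zero_value
    log_rr = math.log((a / n_a) / (b / n_b))
    se = math.sqrt(1.0 / a - 1.0 / n_a + 1.0 / b - 1.0 / n_b)
    return log_rr, se


def risk_ratio_ci(pos_a, n_a, pos_b, n_b, z=1.96):
    """Risk ratio with an approximate 95% confidence interval."""
    log_rr, se = log_risk_ratio(pos_a, n_a, pos_b, n_b)
    return (
        math.exp(log_rr),
        math.exp(log_rr - z * se),
        math.exp(log_rr + z * se),
    )
```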

We assessed the ability of our models to predict RDT-measured preva- 
lence based on microscopy- or PCR-measured prevalence data. Leave-one-out 
cross validation was used to evaluate the RDT:PCR and the RDT:microscopy 
models separately. The data available for direct comparison of malaria detec- 
tion by RDT and PCR in the same individuals were sparse relative to the quan- 
tity of data available for the RDT:microscopy and previous microscopy:PCR 
comparisons. Therefore, we also triangulated the relationship between RDT- 
and PCR-measured prevalence by combining the RDT:microscopy relationship 
calculated in this study with the microscopy:PCR prevalence relationship that 
has been previously defined’. The credible interval of the triangulation line 
was computed from the posterior distributions of all the parameters from both 
equations combined. We evaluated whether this triangulated RDT:PCR rela- 
tionship was significantly different from the observed RDT:PCR relationship 
using the posterior distributions of the predictions from each model. 
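
The triangulation step amounts to composing two linear relationships on the log odds scale. The sketch below shows the algebra only. Pairing posterior draws one-to-one is one simple way to propagate uncertainty and is an assumption here, as are the function names; the paper does not specify how the posterior distributions were combined.

```python
def compose_linear(rdt_on_micro, micro_on_pcr):
    """Compose log odds RDT = a1 + b1 * (log odds microscopy)
    with log odds microscopy = a2 + b2 * (log odds PCR).

    Returns (intercept, slope) of log odds RDT as a function of log odds PCR.
    """
    a1, b1 = rdt_on_micro
    a2, b2 = micro_on_pcr
    return a1 + b1 * a2, b1 * b2


def triangulated_draws(rdt_micro_draws, micro_pcr_draws):
    """Pair posterior draws from the two models to obtain draws of the
    triangulated intercept and slope."""
    return [
        compose_linear(rm, mp)
        for rm, mp in zip(rdt_micro_draws, micro_pcr_draws)
    ]
```

Quantiles of the composed draws, evaluated at each PCR prevalence of interest, would then give a credible interval for the triangulated line.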


RESULTS 

Figure 2 | The relationship between rapid diagnostic tests (RDTs) and microscopy 
Plasmodium falciparum prevalence overall (a) and stratified by age group (b). 
In b, red indicates children (those under 15 years) and yellow indicates adults 
(those over 15 years). Dashed lines indicate the expected relationship if RDT and 
microscopy detected equal prevalence. Horizontal and vertical lines indicate 
95% confidence intervals around point estimates, whereas coloured solid lines 
indicate the median of the Bayesian posterior distributions from the fitted model 
and shaded areas indicate 95% credible intervals. Radius of point estimates 
indicates cluster size (from small to large: <100, 100-1,000 and >1,000). 


Literature search and data collection. The literature search generated 549 re- 
sults in PubMed and an additional 37 in Embase for RDT and microscopy, and 
2,247 results in PubMed and an additional 426 in Embase for RDT and PCR. In 
total, 20 RDT:microscopy studies and 13 RDT:PCR studies from the literature 
search met our inclusion criteria. Combined with additional data sets from DHS 
and unpublished studies, the pooled data available for evaluation yielded 323 
pairs of prevalence estimates for RDT and microscopy??*244-*? and 162 pairs 
for RDT and PCR*977343940.42.45-55. The extracted proportions together with the 
main characteristics of the studies from our literature search are provided in 
the Supplementary Information. The main PCR method used was nested PCR 
(nPCR; 15 of 20) of which mainly the Snounou method’? was used (11 of 15). 
The other methods included LAMP (1 of 20) and qPCR (4 of 20). All of the in- 
cluded RDTs in both comparisons were based on HRP2, with 8 out of 20 studies 
also including pLDH to measure species other than P. falciparum. However, this 
study only focuses on the detection of P. falciparum infections. 


Comparison of RDT- and microscopy-measured prevalence. Analysis of 
RDT- and microscopy-measured prevalence included data from 172,281 indi- 
viduals who were tested with RDTs (cluster prevalence range = 0-92%) and 
186,434 tested with microscopy (cluster prevalence range = 0-87%). The 
323 geographical clusters spanned a total of 29 countries (cluster size range = 
5-7,664). Overall, microscopy detected 87% (95% confidence interval (CI) = 
74-102%) of RDT-positive P. falciparum infections. 
Therefore, RDT and microscopy detection was comparable (Fig. 2a, Table 1), 
with less of a difference between the two diagnostic methods in children under 
15 years of age (77%, 95% CI = 71-85%) compared with adults (over 15 years) 
(60%, 95% CI = 48-86%) (Fig. 2b, Table 1). The lower age-specific risk ratios 
are due to smaller cluster sizes after stratifying the data by age group. However, 
regression analysis of individual-level data did not show a significant associa- 
tion between age group and test discordance (Supplementary Table 1). 


Table 1 | Best fit relationships between RDT:microscopy and RDT:PCR prevalence. 

RDT:microscopy 
  log odds RDT prevalence = 0.108 + 0.907 × log odds microscopy prevalence (all ages) 
  log odds RDT prevalence = 0.109 + 0.908 × log odds microscopy prevalence (under 15 years) 
  log odds RDT prevalence = -0.168 + 0.890 × log odds microscopy prevalence (over 15 years) 

RDT:PCR 
  log odds RDT prevalence = -0.968 + 1.186 × log odds PCR prevalence (all ages) 
  log odds RDT prevalence = -0.382 + 1.306 × log odds PCR prevalence (under 5 years) 
  log odds RDT prevalence = -0.864 + 1.213 × log odds PCR prevalence (6-15 years) 
  log odds RDT prevalence = -1.378 + 1.300 × log odds PCR prevalence (over 15 years) 
  log odds RDT prevalence = 1.097 + 1.690 × log odds PCR prevalence (<5% prevalence) 
  log odds RDT prevalence = 0.211 + 1.754 × log odds PCR prevalence (5-20% prevalence) 
  log odds RDT prevalence = -0.516 + 1.904 × log odds PCR prevalence (>20% prevalence) 
  log odds PCR prevalence = 0.108 + 0.907 × [(log odds RDT prevalence - 0.954)/0.868] 

PCR, polymerase chain reaction; RDT, rapid diagnostic test. 
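
The all-ages relationships in Table 1 can be used to convert a prevalence measured by one method into the expected RDT-measured prevalence. The Python sketch below does this for a 10% input prevalence; the coefficients are copied as printed in Table 1, while the function and variable names are hypothetical and the uncertainty around the fitted lines is ignored.

```python
import math

# (intercept, slope) on the log odds scale, as printed in Table 1.
RDT_FROM_MICROSCOPY_ALL_AGES = (0.108, 0.907)
RDT_FROM_PCR_ALL_AGES = (-0.968, 1.186)


def convert_prevalence(prevalence, intercept, slope):
    """Map a prevalence (0 to 1) through a linear log odds relationship."""
    log_odds = math.log(prevalence / (1.0 - prevalence))
    predicted_log_odds = intercept + slope * log_odds
    return 1.0 / (1.0 + math.exp(-predicted_log_odds))


rdt_from_10pct_microscopy = convert_prevalence(0.10, *RDT_FROM_MICROSCOPY_ALL_AGES)
rdt_from_10pct_pcr = convert_prevalence(0.10, *RDT_FROM_PCR_ALL_AGES)
```

Under these point estimates, a microscopy prevalence of 10% corresponds to an RDT prevalence of roughly 13%, whereas a PCR prevalence of 10% corresponds to an RDT prevalence of roughly 3%.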


Effect of individual-level covariates on RDT:microscopy discordance. In addi- 
tion to age, we explored the effect of several other covariates on diagnostic 
outcomes, and adjusted for transmission intensity as assessed by microsco- 
py-measured prevalence (Supplementary Table 1). A significant association 
was seen between self-reported antimalarial use in the two weeks before 
survey testing and RDT positivity in individuals who tested negative by mi- 
croscopy (OR = 1.71, 95% CI = 1.16-2.51, p = 0.006). The presence of fever 
at the time of testing (recorded temperature with study-specific cut-off 
or self-reported) reduced the odds of undetected malaria infection by RDT 
among microscopy-positive individuals (OR = 0.59, 95% CI = 0.39-0.89, 
p<0.001). Among individuals testing negative by microscopy, presence of a 
fever was significantly associated with RDT positivity (OR = 1.84, 95% CI = 
1.51-2.24, p<0.001), after adjusting for transmission intensity. There was a 
borderline significant increased risk of malaria infection being undetectable 
by RDT among those who used an ITN and were microscopy positive (OR = 
1.26, 95% CI = 1.00-1.59, p = 0.053), whereas use of an ITN was associated 
with decreased RDT positivity (OR = 0.84, 95% CI = 0.73-0.97, p = 0.019) 
among microscopy-negative individuals. There was no evidence of an asso- 
ciation between RDT brand and the risk of an undetected malaria infection 
by RDT among microscopy-positive individuals. Among microscopy-negative 
individuals, the proportion testing positive was different between RDT brands, 
but these results are difficult to interpret, owing to complete correlation be- 
tween study and RDT brand. The year of the survey was not associated with 
discordant test results for RDT:microscopy. 


Comparison of RDT- and PCR-measured prevalence. Analysis of RDT- and 
PCR-measured prevalence included 35,887 individuals tested with an RDT 
(cluster prevalence range = 0-45%) and 31,178 individuals tested with PCR 
(cluster prevalence range = 0-52%). There were a total of 162 geographical 
clusters across 17 countries (cluster size range = 5-3,307, Figs 3a,b and Table 
1). Pooled meta-analysis across all surveys showed that RDTs detected an av- 
erage of 41% (95% CI = 26-66%) of PCR-positive infections. This primarily 
reflects the relationship between RDT and PCR in low-transmission settings, 
with an average PCR prevalence of 8% across all the clusters included in our 
analysis. 


Figure 3 | The relationship between rapid diagnostic test (RDT) and polymerase 
chain reaction (PCR) prevalence overall (a) and zoomed in for <20% PCR 
prevalence (b). Blue, observed RDT:PCR prevalence data and model fit; pink, 
the triangulated RDT:PCR comparison (see Methods); grey, the PCR:microscopy 
comparison from ref. 13. Dashed lines indicate the expected relationship if RDT 
(or microscopy) and PCR detected equal prevalence. Horizontal and vertical 
lines indicate 95% confidence intervals around point estimates, whereas 
coloured solid lines indicate the median of the Bayesian posterior distributions 
from the fitted model and shaded areas indicate 95% credible intervals. Radius 
of point estimates indicates cluster size (from small to large: <100, 100-1,000 and 
>1,000). 


Age, transmission intensity and undetected malaria infection by RDT. As 
with the relationship between RDT- and microscopy-measured prevalence, 
stratifying by age group improved the model fit to the data, showing a de- 
crease in detectability by RDT with increasing age (Figs 4a-c). Meta-analysis 
of the risk ratio between RDT and PCR positivity showed that, for children un- 
der 5 years of age, RDTs detected 81% (95% CI = 74-89%) of PCR-positive in- 
fections. By comparison, RDTs detected a smaller proportion of PCR-positive 
infections in school-aged individuals (6-15 years) (70%, 95% CI = 57-86%), and 
an even smaller proportion among adults 
over 15 years of age (49%, 95% CI = 31-78%). There was a larger data set 
available for analysis in the under 5 (140 clusters) and 6-15 (136 clusters) age 
groups compared with adults (81 clusters), suggesting that additional data in 
the higher age group could help to improve the accuracy of these estimates. 
Previous studies have suggested that the proportion of carriers with 
sub-microscopic infections decreases in areas of higher transmission intensi- 
ty, potentially because of an association with re-infection and increased para- 
site density’*"*. A similar trend was also observed in the relationship between 
RDT and PCR detectability. The fit to our data was improved after stratify- 
ing by transmission intensity based on PCR-measured prevalence, showing 
increased RDT sensitivity compared with PCR as transmission increases 
(Fig. 4d-f). However, meta-analysis of the risk ratio between RDT and PCR 
positivity did not show a significant difference between the three transmission 
ranges, possibly indicating that more data are needed to define a more robust 
relationship for each transmission setting. 

Figure 5 shows RDT detectability as a proportion of PCR-positive individ- 
uals, stratified by age and transmission intensity. Irrespective of transmission 
intensity, adults have the highest percentage of RDT-undetectable infections. 
By contrast, the percentage of individuals with RDT-detectable infections in 
all age groups increases as transmission intensity increases. However, since 
infection rates are greater at high-transmission intensities, RDTs may still miss 
a larger absolute number of infectious individuals at this level of endemicity. 
Best-fit model estimates of PCR-measured prevalence based on RDT-meas- 
ured prevalence are summarized in Figs 3, 4 and Table 1. 
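
The point that a higher detected fraction can coexist with a larger absolute number of missed infections is simple arithmetic, sketched below in Python. The population size, prevalences and detected fractions are hypothetical values chosen for illustration and are not estimates from this study.

```python
def missed_infections(population, pcr_prevalence, rdt_detected_fraction):
    """Number of PCR-positive individuals an RDT screen would miss."""
    infected = population * pcr_prevalence
    return infected * (1.0 - rdt_detected_fraction)


# Hypothetical settings, not estimates from this study.
low_transmission = missed_infections(10_000, 0.04, 0.40)
high_transmission = missed_infections(10_000, 0.30, 0.70)
```

In this example the high-transmission setting has the better detected fraction (70% compared with 40%) but still leaves about 900 infections undetected, compared with about 240 in the low-transmission setting.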


Effect of individual-level covariates on RDT:PCR discordance. We evaluated the 
impact of age and transmission intensity on RDT positivity among PCR-neg- 
ative individuals as a potential indicator of prolonged HRP2 clearance time. 
Logistic regression, adjusted for cluster PCR-measured prevalence, showed 
that among PCR-negative individuals, school-aged children had a significantly 
higher RDT positivity (OR = 1.53, 95% CI = 1.28-1.82, p<0.001) when compared 
with a baseline of children under 5 years of age. Adults showed similar odds of 
being RDT positive (OR = 1.00, 95% CI = 0.64-1.58, p = 0.990) to those under 
5 years. The odds of an infection being undetected by RDT, based on PCR positivity, were 
highest in adults (OR = 5.04, 95% CI = 4.114-6.13, p<0.001) compared with 
those under 5 years, with a similar risk in school-aged children and those under 
5 years (Supplementary Table 2). 

RDT positivity among PCR-negative individuals varied between RDT 
brands, as did the detection of infection in PCR-positive individuals, but these 
results were not significant. Patients with a fever were less likely to have un- 
detected infections by RDT if they were PCR positive (OR = 0.14, 95% CI = 
0.06-0.32, p<0.001), but also more likely to have an RDT-positive result if they 
were PCR negative (OR = 4.86, 95% CI = 2.29-10.30, p<0.001). More recent 
surveys showed a lower risk of RDT-undetected infections, based on PCR pos- 
itivity (OR = 0.77 per year, 95% CI = 0.60-0.99, p = 0.044), which may indi- 
cate an improved performance of RDTs over time. PCR method was associated 
with test discordance at borderline significance, with RDTs detecting fewer 
PCR-positive results measured by qPCR than those measured by PCR (OR = 1.92, 
95% CI = 0.98-3.74, p = 0.056), reflecting higher sensitivity of qPCR, as de- 
scribed previously'**°. 


Model validation. From the leave-one-out analysis, the correlation coefficient 
between observed and predicted values of RDT-measured prevalence from the 
RDT:PCR model was 0.67, indicating a moderate agreement. The correlation 
coefficient between observed and predicted values of RDT-measured preva- 
lence from the RDT:microscopy model was 0.92, indicating a relatively stronger 
agreement (Fig. 6). The credible interval of the triangulated RDT:PCR relationship was 
narrower than that of the directly observed line, owing to the larger number 
of data points in the RDT:microscopy and microscopy:PCR data sets (Figs 2, 
3, Table 1). There was no significant difference between the triangulated and 
observed relationships at any transmission intensity. 
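
Leave-one-out validation of this kind can be sketched as follows. Ordinary least squares is swapped in for the Bayesian model purely to keep the example self-contained, so it illustrates the validation loop and the correlation summary rather than the fitting procedure used in the study. All names are hypothetical, and inputs are expected to be Python lists on the log odds scale.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares intercept and slope."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out_predictions(xs, ys):
    """Predict each y from a line fitted to all the other points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predictions.append(intercept + slope * xs[i])
    return predictions


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Calling `pearson(ys, leave_one_out_predictions(xs, ys))` gives the kind of observed-versus-predicted correlation reported above.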


DISCUSSION 

As the burden of malaria continues to decline in many regions’, it is crucial 
to understand the suitability of diagnostics for use in low-transmission and 
near-eliminating areas where MSAT and MDA strategies are likely to be 
applied. More specifically, how will diagnostic accuracy affect the ability 
of MSAT programmes to detect and treat asymptomatic individuals or de- 
termine local malaria prevalence thresholds for the initiation of MDA? Our 
study results show that the detection capability of RDTs is comparable with, 
and often greater than, that of microscopy. On average, microscopy captured 87% 
of RDT-positive infections, with higher test concordance in children than in 
adults. The extent to which this higher RDT detection reflects increased sen- 
sitivity, lack of specificity, or both, is unclear. Compared with molecular detec- 
tion methods, however, RDTs still miss a substantial proportion of infections, 
capturing only 41% of PCR-positive individuals in low-transmission settings. 
Our analysis included cross-sectional data with paired prevalence measures 
by either RDT and microscopy or RDT and PCR from more than 180,000 indi- 
viduals, spanning more than 400 geographical clusters. The detection levels 
observed differed depending on age and transmission intensity, reflecting 
complex dynamics at both the ecological and host level that may influence 
parasite densities and the relative performance of these diagnostics. 


Figure 4 | The relationship between rapid diagnostic test (RDT) and polymerase 
chain reaction (PCR) prevalence by age group (a-c) and PCR prevalence band 
(d-f). The Bayesian model was fitted separately for each age group or PCR 
prevalence band. Age groups are younger than 5 years (a), 6-15 years (b) and 
older than 15 years (c). PCR prevalence bands are <5% (d), 5-20% (e) and >20% 
(f). Dashed lines indicate the expected relationship if RDT and PCR detected 
equal prevalence. Horizontal and vertical lines around point estimates indicate 
95% confidence intervals, whereas coloured solid lines indicate the median of 
the Bayesian posterior distributions from the fitted model and shaded areas 
indicate 95% credible intervals of these fits. Radius of point estimates indicates 
cluster size (from small to large: <50, 50-100 and >100). 


Factors correlated with the accuracy of RDTs are varied and likely to be 
driven by subtleties in the concentration and duration of HRP2 antigens in 
peripheral circulation. A lower specificity is expected for RDTs given that, in 
addition to current infection, they can detect recent infection owing to resid- 
ual HRP2 even after parasite clearance. Our analysis found that RDTs had a 
higher positivity rate than microscopy among those who were more likely to 
have current or recent high parasite densities — children, those with measured 
or reported fever and those recently treated with antimalarial drugs. This may 
indicate that high parasite densities and, therefore, ruptured schizonts (asex- 
ual parasites that replicate to form multiple red blood cell invading parasites), 
lead to increased and/or prolonged HRP2 levels. These levels are likely to vary 
depending on an individual's clinical status and stage of infection owing to as- 
sociated fluctuations in parasite density. Because RDTs have been designed for 
clinical use, it is intuitive that their performance would be optimal in the detec- 
tion of high-density infections associated with symptomatic disease. A previ- 
ous analysis evaluating the sensitivity of RDTs and microscopy, specifically in 
individuals with clinical symptoms, found an association between parasite den- 
sity and RDT positivity®°. That analysis also stressed the issue of false positives 
and how RDT specificity, in addition to being influenced by parasite density, 
may be correlated with age and transmission intensity. Further investigation 
into how RDT accuracy varies between clinical and subclinical populations 
could help to elucidate the factors that drive these differences. Our analysis 
also found that using an ITN was associated with better concordance of RDT 
and microscopy results, most probably due to a lower risk of infection. This dis- 
tinction is particularly relevant for elimination strategies, because an RDT-pos- 
itive and microscopy-negative result after parasite clearance may still indicate 
recent transmission in a population, whereas absence of infection does not. 
In general, it should be noted that the quality of microscopy is likely to vary 
more widely than that of RDTs. Microscopy in the context of research surveys 
is more accurate than that typically encountered during routine surveillance. 
Therefore, the relative sensitivity of these diagnostics may be more discordant 
in programmatic settings than the relationship observed in this study. 


Figure 5 | Rapid diagnostic test (RDT) detectable (darker colours) and undetectable (lighter colours) infections based on polymerase chain reaction positive 
(PCR+) infections by age (under 5 years, 6-15 years and older than 15 years) and transmission intensity (PCR prevalence <5%, 5-20% and >20%). The height 
of the bars for RDT detectable and undetectable proportions reflects the total prevalence of infection in that group according to PCR, whereas the width 
of the bars shows the proportion of the population in each age group in most African settings (younger than 5 years (blue), 15%; 6-15 years (red), 35%; and 
over 15 years (green), 50% of the total population”). 


Our analysis also found a number of factors that correlated with detec- 
tion by RDT and PCR. Previous studies have demonstrated that the proportion 
of carriers with sub-microscopic infections decreases in areas of high-trans- 
mission intensity, potentially associated with superinfection (new malaria 
infection in already infected individuals)'*". This trend was also observed in 
our analysis — the proportion of PCR-measured infections that were detect- 
ed by RDT increased with higher transmission intensity. Although the inter- 
action between infection, immunity and parasite density in these settings is 
not fully understood, it has been suggested that only partial cross-immunity 
is acquired against malaria parasite clones. Greater multiplicity of infection 
in higher transmission settings could result in higher parasite densities if host 
immune systems cannot respond to the diversity of parasites or if parasites 
increase growth rates in the presence of competing clones. In addition to 
transmission intensity, we also observed age-associated variations in RDT de- 
tection. Our analysis shows that, after adjusting for transmission intensity, the 
odds of having an RDT-undetectable infection in adults were fivefold higher 
than in children under 5 years, potentially owing to more enhanced im- 
mune responses in adults that suppress parasite proliferation. This finding 
coincides well with data that show a lower sensitivity of microscopy relative 
to PCR among adults®. In addition, among PCR-positive individuals, the odds 
of a positive RDT result were seven times higher in patients with a fever. Over- 
all, these results emphasize that fever, superinfections and childhood infec- 
tions are commonly associated with high parasite densities, which, in turn, 
may lead to higher HRP2 levels that persist after parasite clearance. A num- 
ber of studies have shown a relationship between parasite biomass and HRP2 
clearance time®*®°. However, these studies were predominantly in areas of 
high-density infections; studies in areas of lower parasite densities are less 
conclusive®'. Moreover, HRP2 concentrations may be influenced by duration 
of infection, parasite sequestration and HRP2 antibody responses”. Therefore, 
characterizing HRP2 detection profiles at parasite densities that are more typ- 
ically found in elimination settings can help to better gauge the accuracy of 
RDTs in these areas. Our results also showed that risk of an RDT-positive and 
PCR-negative test result was higher in school-aged children compared with 
children under 5 and adults. This may be further evidence for an association 
between age and recent high parasite density (approximately 2-4 weeks), but 
may also suggest that infections can fall below the detection limit of PCR and 
still be captured by RDTs. RDT results that are typically presumed to be false 
positives may be advantageous when the identification of a recent as well as 
a current infection is needed, such as in elimination settings, or if HRP2 is 
still measurable during periods of fluctuating parasite density that drop below 
the molecular detection threshold. An improved understanding of RDT per- 
formance relative to PCR methods of various sensitivities, such as qPCR and 
LAMP, could help to further benchmark the range at which RDTs can optimally 
operate. Although the impact of the PCR method on test sensitivity has been 
investigated in previous studies“, more data are required to evaluate this rela- 
tive to RDT sensitivity in more detail. 
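
The sevenfold figure for fever quoted in this paragraph appears to be the reciprocal of the odds ratio of 0.14 reported in the Results for RDT-undetected infection among PCR-positive individuals; that reading is an inference rather than something the text states. A minimal Python sketch of the conversion, with a hypothetical function name:

```python
def invert_odds_ratio(odds_ratio, ci_lower, ci_upper):
    """Odds ratio and confidence interval for the complementary outcome.

    Taking reciprocals swaps the lower and upper bounds.
    """
    return 1.0 / odds_ratio, 1.0 / ci_upper, 1.0 / ci_lower


# Values taken from the Results: OR = 0.14, 95% CI = 0.06-0.32.
fever_or, fever_lower, fever_upper = invert_odds_ratio(0.14, 0.06, 0.32)
```

This gives an odds ratio of about 7.1 for a positive RDT result, with an interval of roughly 3.1 to 16.7.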


Figure 6 | The relationship between observed and predicted rapid diagnostic 
test (RDT) prevalence from the RDT:microscopy comparison (a), and the 
RDT:polymerase chain reaction comparison (b). Predictions were obtained using 
leave-one-out cross-validation. 


We were able to define a more robust model for the relationship between 
prevalence measured by RDT compared with microscopy, than for the rela- 
tionship between prevalence measured by RDT compared with PCR. This is 
because a more comprehensive data set of comparative RDT and microsco- 
py measures was available across a wider range of transmission intensities. 
Medium- to high-transmission settings were particularly under-represented 
in the comparison of RDT and PCR measures. With more than half of our data 
from <5% PCR prevalence settings (57%; 93 of 162 clusters), the RDT:PCR 
relationship described here primarily reflects RDT performance at low-trans- 
mission intensity. However, the relationship between RDT- and PCR-meas- 
ured prevalence estimated from directly observed paired data was not sta- 
tistically different from the RDT:PCR relationship estimated by triangulating 
the RDT:microscopy and microscopy:PCR relationships based on independent 
data sets, improving confidence in our findings. Additional covariate informa- 
tion in future studies would further explain other factors that influence diag- 
nostic sensitivity. Although we included RDT brand as a covariate in both the 
RDT:microscopy and RDT:PCR models, the studies in this meta-analysis were not 
designed specifically to evaluate RDT brand, so the data are not sufficiently repre- 
sentative to draw conclusions on its impact on diagnostic sensitivity. 

Overall, this study has established the relative detection capabilities of 
existing diagnostics for the identification of asymptomatic individuals infected 
with P. falciparum. To inform community chemotherapy programmes, however, 
further analysis is needed to determine to what extent these individuals con- 
tribute to onward transmission. As with detection, the potential infectiousness 
of asymptomatic individuals is sensitive to fluctuations in parasite density over 
the course of an infection and by season'®®°. These are driven by the maturity 
of the host's immune response, which may vary by age and by local trans- 
mission dynamics, such as seasonality, that can influence population-level 
immunity or within-host parasite behaviour. Therefore, defining infectivity in 
relation to parasite density is especially important; this is addressed further 
by Slater and colleagues in a companion paper in this supplement®. Once the 
contribution of asymptomatic individuals to the infectious reservoir is better 
defined, future analyses should ideally establish optimal detection limits of 
new diagnostics for use in control and elimination strategies. 


Mass-screen-and-treat and targeted mass-drug-administration strategies are being considered as a means to interrupt transmission 
of Plasmodium falciparum malaria. However, the effectiveness of such strategies will depend on the extent to which current and 
future diagnostics are able to detect those individuals who are infectious to mosquitoes. We estimate the relationship between par- 
asite density and onward infectivity using sensitive quantitative parasite diagnostics and mosquito feeding assays from Burkina Faso. 
We find that a diagnostic with a lower detection limit of 200 parasites per microlitre would detect 55% of the infectious reservoir 
(the combined infectivity to mosquitoes of the whole population weighted by how often each individual is bitten) whereas a test 
with a limit of 20 parasites per microlitre would detect 83% and 2 parasites per microlitre would detect 95% of the infectious reser- 
voir. Using mathematical models, we show that increasing the diagnostic sensitivity from 200 parasites per microlitre (equivalent to 
microscopy or current rapid diagnostic tests) to 2 parasites per microlitre would increase the number of regions where transmission 
could be interrupted with a mass-screen-and-treat programme from an entomological inoculation rate below 1 to one of up to 4. 
The higher sensitivity diagnostic could reduce the number of treatment rounds required to interrupt transmission in areas of lower 
prevalence. We predict that mass-screen-and-treat with a highly sensitive diagnostic is less effective than mass drug administration 
owing to the prophylactic protection provided to uninfected individuals by the latter approach. In low-transmission settings such as 
those in Southeast Asia, we find that a diagnostic tool with a sensitivity of 20 parasites per microlitre may be sufficient for targeted 
mass drug administration because this diagnostic is predicted to identify a similar village population prevalence compared with that 
currently detected using polymerase chain reaction if treatment levels are high and screening is conducted during the dry season. 
Along with other factors, such as coverage, choice of drug, timing of the intervention, importation of infections, and seasonality, the 
sensitivity of the diagnostic can play a part in increasing the chance of interrupting transmission. 
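
The idea of the proportion of the infectious reservoir detected at a given diagnostic threshold can be sketched in Python as below. The data structure, the function name and the example values are hypothetical; the sketch only illustrates the weighting described in the abstract (infectivity to mosquitoes weighted by how often each individual is bitten) and does not reproduce the authors' statistical model.

```python
def reservoir_detected(individuals, detection_limit):
    """Share of the infectious reservoir carried by detectable individuals.

    individuals: list of (parasites_per_microlitre, infectivity, biting_weight)
    detection_limit: lowest parasite density the diagnostic can detect
    """
    total = sum(infectivity * weight for _, infectivity, weight in individuals)
    detected = sum(
        infectivity * weight
        for density, infectivity, weight in individuals
        if density >= detection_limit
    )
    return detected / total


# Hypothetical individuals, for illustration only.
example = [
    (500.0, 0.30, 1.0),
    (50.0, 0.10, 1.0),
    (5.0, 0.05, 0.5),
    (0.5, 0.01, 0.5),
]
shares = {limit: reservoir_detected(example, limit) for limit in (200, 20, 2)}
```

Lowering the detection limit can only add individuals to the detected set, so the detected share never decreases as sensitivity improves.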


Plasmodium falciparum malaria was responsible for an estimated 
584,000 (range 367,000-755,000) deaths in 2013, most of which oc- 
curred in young children in sub-Saharan Africa’. Although the burden 
has reduced in response to global efforts to increase the provision of prov- 
en malaria interventions such as insecticide-treated bed nets and access to 
health care and treatment’, it remains high. One of the challenges in reducing 
malaria transmission is the long duration of infection in the human host, which 
in semi-immune individuals may persist for a year or more. In particular, al- 
though infection often leads to disease in naive individuals, those with suf- 
ficient acquired immunity can harbour parasites — and hence be onwardly 


infectious to mosquitoes — without exhibiting symptoms?. One option for 
speeding the decline in transmission could be to target the asymptomatic 
reservoir of infection* by providing either periodic mass-screen-and-treat 
(MSAT) programmes, focal MSAT or a reactive strategy in which individu- 
als living in the vicinity of an identified clinical case are screened and treated. 
However, the extent to which such strategies are able to reduce the infectious 
reservoir will depend on the extent to which the diagnostic used to identify 
infected individuals also detects those who are onwardly infectious. Anoth- 
er form of targeting could take place at the population level (for example a 
village) where mass interventions are deployed if the population prevalence 
Figure 1 | Percentage of infected individuals and the infectious reservoir detected. a, Parasite density distributions of the three infection groups defined in the
statistical analysis section. b, Proportion of infections detected for each infection group for a range of diagnostic thresholds between 0.001 and 10’ parasites per 


microlitre. For each value of the x-axis we calculate the proportion of each density distribution (from a) that would be detected. c, The proportion of the infectious
reservoir of the whole population that would be detected for each diagnostic threshold. This is the combined infectivity to mosquitoes of all individuals with asexual
parasite densities above the diagnostic threshold weighted by body size. The dashed vertical lines show the three detection thresholds considered: 200, 20 and 2
parasites per microlitre. d, Proportion of the infectious reservoir detected. Comparison of the observed data as shown in c to a simulation-based approach using
OpenMalaria. We simulated the Burkina Faso setting using OpenMalaria to see how well the distributions of parasite density and the contribution to the infectious
reservoir could be captured. We assumed that the seasonality followed that reported in Burkina Faso with 30 infectious bites per person per year in the village of
Laye and 300 in the village of Dapélogo, and we assumed that the coverage of treatment for malaria fevers was low. The assumptions about transmission intensity
and case management were not crucial to predictions in this range. We weighted each individual’s contribution to the infectious reservoir by their body surface
area for both the observed data and predictions to account for differential biting rates. The simulated individuals have the same age- and village-distribution as
the observed data. The asexual densities were measured using quantitative nucleic acid sequence based amplification (QT-NASBA) whereas OpenMalaria output
densities were calibrated to microscopy using the standard method of counting parasites against leukocytes and converting them, assuming 8,000 leukocytes per
microlitre. The agreement between microscopy and QT-NASBA densities is not perfect; however, we assume for the purposes of this validation that they are similar
because they are two imperfect methods to measure the density of infection.


exceeds a given threshold. In this case, the choice of diagnostic will influence 
the accuracy of the targeting and the choice of detected prevalence threshold 
above which the intervention occurs. 

Historically, microscopy was the most commonly used diagnostic, both
in the clinic to confirm cases, and in the field to estimate the prevalence of
asymptomatic infection. Although detection thresholds of 4-20 parasites per
microlitre are achievable using a Giemsa-stained thick blood film in a
controlled laboratory setting, thresholds of 100-200 parasites per microlitre are
more common in field settings. More recently, rapid diagnostic tests (RDTs)
have been introduced in the field. These tests detect malaria antigens, most
commonly histidine-rich protein 2 (HRP2) and plasmodium lactate dehydrogenase
(pLDH), which are antigens produced by most, but not all, P. falciparum
parasites. These diagnostics are easier to use and more reliable than
microscopy, but have similar detection thresholds. The advent of highly
sensitive molecular methods that detect parasite DNA or RNA such as polymerase
chain reaction (PCR) and quantitative nucleic acid sequence based amplification
(QT-NASBA) has highlighted that many infected individuals have parasite
densities that are too low to be detected either by microscopy or RDTs.
However, even these methods do not detect all the infections that are present
in infected human hosts and they are currently not available as routine
field diagnostic tests because they require high-level laboratory facilities
for nucleic acid extraction, amplification and detection that are unavailable
in many resource-poor endemic settings. Even the most operationally attrac- 
tive molecular diagnostic available — loop-mediated isothermal amplification 




NEXT GENERATION RDTS | SLATER ET AL. 


Table 1 | Gametocyte density, prevalence and proportion of mosquitoes infected for the three infection groups: clinical disease, asymptomatic detectable infection
and asymptomatic undetectable infection, and contribution to the infectious reservoir (total infectivity weighted by body surface area).

Compartment and percentage of total infected population | Gametocyte densities (parasites per microlitre) of gametocytaemic individuals (median and interquartile range) | Proportion of individuals in compartment that are gametocytaemic (by QT-NASBA) | Percentage of mosquitoes infected by all the individuals in compartment | Contribution to infectious reservoir
Clinical disease (4.4%) | 290.9 (181.7, 1,169.7) | 100% | 23.1% | 9.0%
Asymptomatic detectable |  |  | 12.71% | 62.3%
Asymptomatic undetectable (53.4%) | 29.8 (1.9, 128.6) | 59% | 3.9% | 28.7%

The clinical disease group consists of the 4.4% of individuals who have the highest parasite densities. Asymptomatic detectable individuals are those that are detectable by QT-NASBA and microscopy, and the
asymptomatic undetectable individuals are those detectable by QT-NASBA, but not microscopy.


(LAMP), which involves a simple DNA extraction procedure, isothermal
amplification and visual examination of positivity — is currently only used in
research settings.

Two recent MSAT trials, which used RDT screening to identify infections,
failed to lead to a sustained reduction in malaria parasite prevalence or
disease incidence. It has been suggested that this could be at least partly due
to ongoing transmission from low-density infections that were undetectable
by the RDT or by microscopy. For example, in one MSAT study in Zanzibar,
just 0.2% of people tested positive by RDT and received an antimalarial in two
rounds 1 month apart, whereas 2.5% and 3.8% of people in the two rounds,
respectively, were positive by PCR. RDT, therefore, detected just 8% and 5.3%
of PCR-positive infections in the two rounds, respectively. A follow-up
study in this setting found that LAMP was able to detect 3.4 times more
infections than RDT. Village-level targeted mass drug administration (MDA)
is being trialled in the Greater Mekong subregion (Cambodia, Laos, Myanmar,
Thailand, Vietnam and Yunnan Province in China) with a view to wider
implementation. In this setting a random sample of villagers is being screened
using PCR to inform decisions on MDA implementation.
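
The Zanzibar percentages quoted above follow from a simple ratio of RDT-positive to PCR-positive prevalence. The short sketch below reproduces that arithmetic; the function name is illustrative, and it assumes (as the text implicitly does) that every RDT-positive person was also PCR-positive.

```python
def fraction_of_pcr_positives_detected(rdt_positive_pct: float,
                                       pcr_positive_pct: float) -> float:
    """Share of PCR-positive infections that were also picked up by the RDT.

    Assumption: all RDT-positive individuals are also PCR-positive, so the
    ratio of the two prevalences equals the RDT's sensitivity relative to PCR.
    """
    return rdt_positive_pct / pcr_positive_pct


# (RDT prevalence %, PCR prevalence %) for the two rounds quoted in the text.
zanzibar_rounds = [(0.2, 2.5), (0.2, 3.8)]

for round_number, (rdt, pcr) in enumerate(zanzibar_rounds, start=1):
    share = 100 * fraction_of_pcr_positives_detected(rdt, pcr)
    print(f"Round {round_number}: RDT detected {share:.1f}% of PCR-positive infections")
```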

To improve the impact of elimination strategies such as MSAT and target- 
ed MDA, researchers are developing RDTs with improved limits of detection. 
Trying to establish the appropriate target performance specifications for these 
tests raises two questions. First, how infectious are individuals with low par- 
asite densities (how much could they contribute to the infectious reservoir)? 
Second, could the use of a more-sensitive diagnostic increase the ability of an 
MSAT intervention to achieve local interruption of transmission? 

We address these questions from a population perspective using a com- 
bination of data analysis and mathematical modelling. Using a data set from 
a study of the human infectious reservoir for malaria in a high endemic site in 
Burkina Faso, we estimate the contribution to the infectious reservoir of indi- 
viduals with different parasite densities and hence the proportion of the infec- 
tious reservoir that could be detected across a range of diagnostic sensitivity 
thresholds. We then use three well-established mathematical models of ma- 
laria transmission to assess how these different diagnostic sensitivities could 
improve the prospect of malaria elimination in an African and an Asian context. 


METHODS 

Data. The data set used consists of information on 130 participants from four
age groups (<5 years, 5-14 years, 15-30 years and >30 years) in the villages
of Laye and Dapélogo in Burkina Faso. Study participants provided 307 venous
blood samples at the start of the wet season (n = 104), the peak of the
wet season (n = 100) and in the subsequent dry season (n = 103). Age- and
village-matched replacements were sought if individuals were lost or refused
follow-up visits. Only individuals with serious acute disease, including severe
clinical malaria, were excluded from participation. Finger prick blood samples
were used for the preparation of microscopy slides, of which 100 microscopic
fields were double read for asexual parasites and gametocytes. Nucleic acids
were extracted from 100 µl venous blood samples and QT-NASBA was used
to detect all parasite stages based on 18S ribosomal RNA and specifically to
detect mature gametocytes based on Pfs25 messenger RNA. The sensitivity
of QT-NASBA was set to 0.01-0.02 parasites per microlitre for both 18S
rRNA and Pfs25 mRNA. Parasite and gametocyte densities were estimated
in relation to a standard asexual and gametocyte stage V dilution series.
Venous blood samples of 3 ml were fed to locally reared 4- to 5-day-old female
Anopheles gambiae sensu stricto mosquitoes; 7 days after feeding a mean of
39.3 (range 14-65) fully fed mosquitoes were dissected in 1% mercurochrome
and screened for oocysts.


Statistical analysis. To estimate the infectious reservoir, we categorized the 
sampled individuals into four groups: clinical disease, which was defined as 
individuals with the highest asexual parasite densities (>140,000 parasites 
per microlitre); asymptomatic detectable, which was defined as individuals 
who were positive using QT-NASBA and microscopy; asymptomatic unde- 
tectable, which was defined as individuals who were positive by QT-NASBA 
and negative by microscopy; and uninfected individuals who were negative by 
QT-NASBA and microscopy. We define positive QT-NASBA as all individuals 
positive for 18S because this indicates the presence of asexual parasite stages. 
For the purposes of this analysis, we make the assumption that QT-NASBA is 
the gold standard diagnostic. Of the 1,095 mosquitoes that fed on 28 individ- 
uals in the group negative by QT-NASBA and microscopy, only 1 became in- 
fected (<0.1% of mosquitoes, 3.5% of humans), indicating that the uninfected 
individuals have extremely low infectiousness to mosquitoes. For each of the
other groups we calculated the proportion that were gametocytaemic using
QT-NASBA, the mean gametocyte density conditional on the presence of
gametocytes (that is, among gametocytaemic individuals) and the
proportion of fed mosquitoes infected. The contribution of each group to the
infectious reservoir was calculated as the total percentage of mosquitoes in- 
fected by each group weighted by the number of people in each group and by 
the relative body surface area (based on age) of those individuals. Non-para- 
metric (Wilcoxon) tests were used to test statistical significance of differences 
in parasite densities between the groups. Differences in the proportions of ga- 
metocytaemic individuals and mosquitoes infected were assessed using logis- 
tic regression models using random effects to allow for repeated sampling of 
some of the same individuals in the different surveys. 

In the models, for a given diagnostic detection limit, the proportion of the 
infectious reservoir that is detectable is calculated as the combined onward 
infectivity to mosquitoes of all individuals with parasite densities above the 
threshold (weighted by body surface area) divided by the infectious reservoir 
of the total population. 
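
The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Individual` record, the field names and the toy values are all hypothetical, and treating a density exactly equal to the detection limit as detected is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Individual:
    density: float       # asexual parasites per microlitre
    infectivity: float   # proportion of fed mosquitoes infected (0 to 1)
    body_surface: float  # relative body surface area, used as a biting weight


def reservoir_detected(people: list[Individual], threshold: float) -> float:
    """Proportion of the infectious reservoir carried by detectable individuals.

    Each person contributes infectivity * body_surface; the detected share is
    the sum over people at or above the threshold divided by the sum over all.
    """
    total = sum(p.infectivity * p.body_surface for p in people)
    if total == 0:
        return 0.0
    detected = sum(p.infectivity * p.body_surface
                   for p in people if p.density >= threshold)
    return detected / total


# Toy population (made-up values, for illustration only).
people = [
    Individual(density=150_000, infectivity=0.25, body_surface=1.0),
    Individual(density=500, infectivity=0.10, body_surface=0.5),
    Individual(density=50, infectivity=0.10, body_surface=1.0),
    Individual(density=1, infectivity=0.05, body_surface=1.0),
]

for limit in (200, 20, 2):
    print(limit, round(reservoir_detected(people, limit), 3))
```

Lowering the threshold can only add individuals to the detected set, so the detected proportion never decreases as sensitivity improves.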


Mathematical models. Three established malaria transmission models were 
used to evaluate how different diagnostic thresholds would impact on the likely 
success of MSAT strategies. The first two models (the Imperial College mod- 
el and OpenMalaria, which was developed by the Swiss Tropical and Public 
Health Institute and the Liverpool School of Tropical Medicine) were used to 
explore the impact of MSAT in an African context. The third (mathematical and 
economic modelling (MAEMOD) developed by the Mahidol Oxford Tropical 
Medicine Research Unit) was used to explore the impact of targeted MDA in 
an Asian context. Full mathematical details of each of the models, the process 
of fitting to data and their validation against further field data are given in refs 4, 
25-33. We summarize how the models capture the infectious reservoir. In all 
three models, we define infectivity as the probability of an individual infecting 
a feeding mosquito, and the infectious reservoir is defined as the sum of the 
infectivity of the whole population multiplied by the probability of being bitten, 
which is based on the body surface area of the individual. 
Figure 2 | Model predictions of the proportion of infected individuals in each class and their contribution to the infectious reservoir. Proportion of the infected
population that have clinical disease (red), asymptomatic infection detectable by microscopy (orange) and asymptomatic infection not detectable by microscopy
(yellow), and their contribution to the infectious reservoir for a range of entomological inoculation rates between 1 and 400. We predicted the proportion of infected
individuals with infections detectable by microscopy for a range of transmission intensities with the three models, assuming the common inputs: stable transmission,
constant seasonality, little case management and no other interventions present. The ‘gametocytes only’ compartment refers to individuals with recently cleared
asexual parasites, but persisting gametocytes.


Imperial College 

We used a deterministic age-structured population-level transmission model 
that tracks three infected compartments of the population — clinical disease, 
patent asymptomatic infection (identifiable by microscopy), and sub-patent 
(sub-microscopic) asymptomatic infection. Onward infectivity to mosquitoes
is highest in clinical disease, intermediate in patent asymptomatic infection
and lowest in sub-patent infection; these values were parameterized by fitting
to four earlier mosquito feeding studies. As the model does not explicitly track parasite density,
we adapted this structure to capture a wider range of diagnostic sensitivities by 
using results of the data analysis to define the proportion of infections that are 
detectable at a given parasite threshold within each of the infected compart- 
ments. An individual's infectivity is determined by their malaria infection status 
10 days previously, to capture lags in gametocyte production. 


OpenMalaria 

We used an individual-based stochastic comprehensive model of malaria
epidemiology. In this model, a population of individuals is updated at 5-day
time steps with model components representing new infections, parasite
densities, acquired immunity, morbidity, mortality and infectivity to mosquitoes.
The course of parasite densities over an infection in a non-immune individual
is described using historical data from patients deliberately infected with
P. falciparum as treatment for neurosyphilis. Immunity to asexual parasites is
derived from a combination of cumulative exposure to both inoculations and
parasite densities, and maternal immunity, and acts to reduce the densities.
The model was fitted to aggregated data from several data sets but not to
longitudinal patterns of densities within individuals. Several different variants
of this model are available. For this study we use the base model from the
overall ensemble. An individual’s infectiousness to mosquitoes is related to
a weighted sum of their recent parasite densities 10, 15 and 20 days
previously, allowing time for gametocytes to develop and circulate. The probability
of infecting a feeding mosquito is given by the probability that at least one
male and one female gametocyte would be taken up in the blood meal, and
the model is fitted to the data of artificial feeding carried out on the patients
with neurosyphilis treated using malaria therapy and then scaled using field
data to account for the difference between infectivity in experimental and field
conditions. An individual who has recently been treated or whose infection
ended naturally may have no asexual parasites, but continue to be infectious
for up to 20 days. The proportion of individuals detected by a diagnostic test,
and the corresponding contribution by these individuals to the infectious
reservoir, are directly recorded.
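
To make the "at least one male and one female gametocyte" criterion concrete, the sketch below works it through under a simplifying assumption that is not taken from the text: the numbers of male and female gametocytes in a blood meal are treated as independent Poisson counts. The function and parameter names are hypothetical, and the fitted relationship used in OpenMalaria may well differ.

```python
import math


def prob_infect_mosquito(mean_male: float, mean_female: float) -> float:
    """Probability that a blood meal holds at least one gametocyte of each sex.

    Assumption (illustrative only): male and female counts are independent
    Poisson variables with the given means, so
    P(at least one of each) = (1 - exp(-mean_male)) * (1 - exp(-mean_female)).
    """
    p_some_male = 1.0 - math.exp(-mean_male)
    p_some_female = 1.0 - math.exp(-mean_female)
    return p_some_male * p_some_female


# The probability rises with gametocyte numbers and saturates towards 1.
for mean_male, mean_female in [(0.1, 0.4), (1.0, 4.0), (10.0, 40.0)]:
    print(mean_male, mean_female, round(prob_infect_mosquito(mean_male, mean_female), 4))
```

Under this assumption the scarcer sex limits infectivity at low densities, which is one way to see why low-density infections can still contribute to the reservoir while infecting few mosquitoes each.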


MAEMOD 


We used a deterministic model for the transmission of P. falciparum malaria
similar to those previously described, with four infection classes: severe;
clinical; asymptomatic and detectable by microscopy; and asymptomatic and 
undetectable by microscopy. Each infection class has a distribution of parasi- 
taemia associated with it, which is used to estimate the sensitivity of various 
diagnostic tests. Each infection class also has an infectiousness associated 
with it based on infectivity data. We assume that treated individuals test pos- 
itive for HRP2 after clearance of asexual parasitaemia for different durations, 
depending on the detection limit of the test. 


Simulated scenarios. We considered two implementation scenarios: an MSAT 
programme in an African setting (Zambia), and targeted MDA in Southeast 
Asia (Cambodia). The former represents a setting in which malaria transmission
is moderate (15% parasite prevalence measured by microscopy in children
under 5 years in 2012), whereas the latter is an area of low prevalence (<1%
parasite prevalence across all ages measured by microscopy in 2010). Three
diagnostic detection thresholds were considered: 200, 20 and 2 parasites per 
microlitre. The highest level was chosen to reflect the diagnostic threshold for 
microscopy and widely used rapid diagnostic tests, although more modern 
RDTs may have a sensitivity closer to 40 parasites per microlitre. The lower 
thresholds were chosen to reflect log orders of difference, while remaining 
a feasible target product profile for a new diagnostic test based on current 
technologies. A diagnostic threshold of 2 parasites per microlitre is currently
achieved by LAMP and nested PCR.


MSAT in an African setting 

The frequency, coverage and artemisinin-based combination therapy used 
were based on the current operational strategy being implemented in Zam- 
bia. A single seasonality profile was used based on average rainfall patterns 
in Zambia between 2002 and 2009 obtained from the US Climate Prediction
Center. We assumed three rounds of MSAT were conducted 1 month
apart during the dry season and repeated annually for a maximum of 8 years.
Figure 3 | Example simulation of time to interrupt transmission. a, Time series
plots from the Imperial College model showing the impact of 8 years of
intervention that consists of three rounds of mass screen and treat (MSAT) or
mass drug administration (MDA) a month apart during the dry season with
dihydroartemisinin and piperaquine at a coverage of 80%. The entomological
inoculation rate is assumed to be 2.5. The blue, yellow and grey lines show the
impact of MSAT using a diagnostic with a sensitivity of 200, 20 and 2 parasites
per microlitre, respectively. The green line shows the impact of MDA implemented with
the same timing and coverage as MSAT. The dashed line shows the prevalence
threshold: if prevalence remains below this line for 50 consecutive days,
we consider transmission to be interrupted. b, Enlarged version of a showing the
interruption of transmission prevalence threshold. The arrows show the time
at which interruption of transmission is achieved and the colours of the arrows
correspond to the prevalence curves. EIR, entomological inoculation rate; PCR,
polymerase chain reaction.


All treatments were assumed to be with dihydroartemisinin and piperaquine,
which has high efficacy in clearing parasites (95%) and a duration of
protection of approximately 30 days. We assumed 80% coverage of individuals at
each round with no correlation between those receiving the drugs in different
rounds, meaning that an individual has a 99% chance of being treated at
least once per year. The simulations were undertaken for a range of assumed
baseline transmission levels for Zambia based on entomological inoculation
rates (EIRs) between 0.5 and 20 infectious bites per person per year with no
imported infections from other regions.
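
The 99% figure follows directly from the stated assumptions of 80% coverage, three rounds per year and no correlation between rounds. A minimal check (the function name is illustrative):

```python
def prob_treated_at_least_once(coverage: float, rounds: int) -> float:
    """Chance of being reached in at least one round when rounds are independent.

    With independent rounds, the chance of being missed every time is
    (1 - coverage) ** rounds, so the complement is the chance of at least
    one treatment.
    """
    return 1.0 - (1.0 - coverage) ** rounds


# Three rounds per year at 80% coverage, as assumed in the simulations.
print(round(prob_treated_at_least_once(0.8, 3), 3))  # 0.992
```

If the same people were systematically missed in every round (perfect correlation), the figure would fall to the single-round coverage of 80%, which is why the independence assumption matters.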

The Imperial College model assumes that the proportion of each infected 
compartment that is detectable for a given diagnostic threshold is unchanged 
as prevalence decreases. In this model interruption of transmission is defined 
Figure 4 | Predictions of interruption of transmission for mass drug
administration (MDA) and mass screen and treat (MSAT) with three diagnostic 
detection limits. a, Number of annual rounds required to interrupt transmission 
for a range of entomological inoculation rate values between 0.5 and 9.5 
predicted by the Imperial College model. b, The probability of interrupting 
transmission by transmission intensity and diagnostic sensitivity predicted by 
OpenMalaria. We simulated scenarios for each effective transmission intensity 
of 10,000 individuals with no imported infections, no correlation in who receives 
each dose and low case-management. Because OpenMalaria is stochastic, we 
ran 100 simulations at each transmission intensity and present the proportion of 
scenarios in which transmission was interrupted. We chose one point in time to 
represent the results, after four years of MSAT. a and b assume three treatment 
rounds per year 1 month apart in the dry season. 


to have occurred once the PCR-detected prevalence for all age groups has 
been sustained below 0.0149% for 50 consecutive days in the absence of 
importation. This corresponds to a 99% probability that one or fewer people
are infected in a population of 1,000 people. In OpenMalaria, interruption of
transmission is achieved when there are no infected individuals in a simula- 
tion. We ran 100 OpenMalaria simulations — each with a population size of 
10,000 people — for each scenario to predict the probability of interrupting 
transmission. We did not carry out a formal model comparison. 
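
The link between the 0.0149% prevalence threshold and the "99% probability that one or fewer people are infected" can be checked with a binomial calculation. The sketch below assumes that infections occur independently across the 1,000 people; that assumption and the function name are illustrative and are not taken from the cited derivation.

```python
def prob_at_most_one_infected(prevalence: float, population: int) -> float:
    """P(0 or 1 infected) when each person is independently infected.

    Assumption: the number infected is Binomial(population, prevalence).
    """
    p_none = (1.0 - prevalence) ** population
    p_one = population * prevalence * (1.0 - prevalence) ** (population - 1)
    return p_none + p_one


# 0.0149% prevalence expressed as a proportion, in a population of 1,000.
threshold_prevalence = 0.0149 / 100
print(round(prob_at_most_one_infected(threshold_prevalence, 1_000), 3))  # 0.99
```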


MDA in Southeast Asia 

At present, targeted MDA is considered the most promising strategy to stop 
the spread of artemisinin-resistant P. falciparum parasites in the Greater 
Mekong subregion. In the Southeast Asian setting, targeting populations
at the village level is being trialled. Villages are screened to measure their
Figure 5 | The predicted seasonal pattern of prevalence by polymerase chain
reaction (PCR), microscopy and rapid diagnostic test (RDT) (with detection 
thresholds of 200, 20 and 2 parasites per microlitre) over 1 year in a high- 
prevalence Cambodian setting. The results show the output from the MAEMOD 
model with high-coverage treatment of clinical cases. 


prevalence and then if they are above a given threshold, they are selected for 
MDA. This is performed in the context of enhanced detection and treatment of 
clinical cases and high-coverage vector control. Currently, the diagnostic used 
is PCR. We used the MAEMOD model to investigate how using an RDT with 
a sensitivity of 200, 20 or 2 parasites per microlitre would affect the
identification of villages. Simulations were performed for a single-peaked
high-transmission Cambodian setting where a village (with a population of roughly
1,000 individuals) can have a prevalence as high as 70%. This prevalence 
would correspond to an EIR of 3.5 per year (assuming little or no treatment of 
clinical cases) or 4.5 per year (assuming high coverage of treatment of clinical 
cases) and single-peaked seasonal transmission. The predicted prevalence in 
a village over time with the different diagnostics was used to assess the ability 
of the diagnostic to select a village for targeted MDA. 


RESULTS 

Table 1 summarizes the infection status and infectivity of the 128 participants 
who were sampled once, twice or three times over one annual transmission 
cycle in Burkina Faso. Of the 277 samples, 251 were found to be positive by 
QT-NASBA. Overall, 4% (11) of infections were considered to have clinical 
disease (associated with high parasite density), 42% (106) were considered 
to have asymptomatic detectable infections and 53% (134) were asympto- 
matic and negative under microscopy, but positive by QT-NASBA (referred to 
as asymptomatic undetectable). The remaining 26 samples were negative by 
both QT-NASBA and microscopy. 

Those with clinical disease had the highest gametocyte and total para- 
site densities, whereas those with asymptomatic detectable infection had ga- 
metocyte and total parasite densities higher than those with asymptomatic 
undetectable infection (Table 1, Fig. 1a). The infection compartment has a sig- 
nificant effect on the proportion of individuals that are gametocyte positive (P 
< 0.001). This trend in parasite density and gametocytaemia is mirrored in the 
infectivity to mosquitoes: those with clinical disease infect a higher proportion
of mosquitoes (23%) than those with asymptomatic detectable
infection (13%) and those with asymptomatic undetectable infection
(4%) (P < 0.001). In this setting, individuals with asymptomatic undetectable
(sub-microscopic) infections made up 53% of the infected population and contributed
29% of the infectious reservoir of the population.

Under all three diagnostic thresholds considered, all individuals with clin- 
ical disease would be detected (Fig. 1b, 1c). At a diagnostic threshold of 200 
parasites per microlitre, 74% of the individuals with asymptomatic detecta- 
ble infection would be detected. Increasing the sensitivity of the diagnostic 
to 20 or 2 parasites per microlitre increases this proportion to 88% and 92%, 
respectively. Similarly, with a sensitivity of 200 parasites per microlitre, only 
22% of individuals with asymptomatic undetectable infection would be de- 
tected. This proportion increases to 49% and 71% by increasing the sensitivity 
of the diagnostic to 20 or 2 parasites per microlitre, respectively. Combining 
these groups and weighting by relative infectivity to mosquitoes, only 55%
of the infectious reservoir is detected using a threshold of 200 parasites per 
microlitre. We estimate that a new diagnostic with a sensitivity of 20 or 2 par- 
asites per microlitre would be able to detect 83% and 95% of the infectious 
reservoir, respectively. 

Figure 1d shows the proportion of the infectious reservoir detected in the 
Burkina Faso setting as predicted by the OpenMalaria model (the model based 
on parasite densities, albeit calibrated to microscopy rather than QT-NASBA 
densities, and therefore the model from which such predictions can be made). 
These generally compare well to those observed in the data (Fig. 1c,d), but 
with less variability with season than observed, and a slightly smaller contri- 
bution to the infectious reservoir at low parasite densities. The detectability 
of infections in the data was generally lowest at the peak of the transmission 
season, but this pattern of seasonality was not evident in OpenMalaria. 

Figure 2 shows the predicted proportion of infected individuals in each in- 
fection class for a range of transmission intensities assuming no seasonality in 
transmission for the three models. In both the Imperial College and OpenMa- 
laria models, these proportions are predicted to remain reasonably constant 
across a range of transmission intensities, although age-specific predictions 
show a decrease in the proportion of younger children with sub-patent infec- 
tions with transmission intensity and an increase in older adults in both mod- 
els (results not shown). By contrast, the MAEMOD model suggests a higher 
proportion of patent infection at high-transmission intensities. This difference 
is probably due to the data sets used for validation, with the MAEMOD model 
being calibrated for low-transmission settings. In both the Imperial College 
and MAEMOD models, at low transmission, the proportion of individuals with 
clinical malaria, patent infection and sub-patent infection, and the infectivity 
from these groups closely match those observed in the Burkina Faso data (Ta- 
ble 1). The OpenMalaria model suggests a similar proportion of the population 
in the patent and sub-patent categories (assuming a sensitivity of microscopy 
close to 200 parasites per microlitre), but predicts that a greater proportion 
of the infectious reservoir comes from those with patent infection. It also pre- 
dicts a lower proportion of the infectious reservoir from sub-patent infections 
under the assumption of no seasonality. However, OpenMalaria fares
reasonably well when the scenario inputs are tailored to the high-transmission,
highly seasonal setting (Fig. 1d).

Two of the models (Imperial College and OpenMalaria) were used to simulate
the impact of MSAT strategies in a Zambian
setting. Figure 3 shows an example of the time to interruption of transmission
with a pre-intervention EIR of 2.5 using the Imperial College model. Here the 
model predicts MSAT using a diagnostic test with a detection limit of 200 
parasites per microlitre would not achieve interruption of transmission, even 
after 10 years. Increasing the sensitivity of the diagnostic to 20 or 2 parasites 
per microlitre would achieve interruption of transmission after 10 and 6 years, 
respectively. The Imperial College model predicts that repeated MSAT using a 
diagnostic with a detection limit of 200 parasites per microlitre only interrupts 
transmission in areas with an EIR below 1 (Fig. 4a). Increasing the sensitivity 
to 20 or 2 parasites per microlitre interrupts transmission in areas with EIRs 
up to 2.5 and 4, respectively. Increasing the sensitivity of the diagnostic also 
reduces the number of rounds required to interrupt transmission. A similar 
pattern is predicted by the OpenMalaria model: interruption of transmission 
is rare using a diagnostic with a sensitivity of 200 parasites per microlitre at 
an EIR greater than 1 (Fig. 4b). Increasing the sensitivity to 20 or 2 parasites 
per microlitre increases the probability of interruption at an EIR of 1, and at 
a sensitivity of 2 parasites per microlitre increases the range at which 
interruption could occur up to an EIR of 4. Thus, in a low-EIR setting with a 
Zambian seasonality pattern, increasing the sensitivity to either 20 or 2 parasites 
per microlitre could improve the probability of interruption and expand the range 
of settings in which this could be feasible. However, it should be noted that 
in both models MDA is predicted to be substantially more likely to interrupt 
transmission and potentially to increase the range of transmission intensities 
at which mass treatment could interrupt transmission. 

Figure 5 shows the output from the MAEMOD model applied to a Cambo- 
dian setting. Owing to the persistence of HRP2 following treatment, this mod- 
el reflects the seasonal variability of an RDT in predicting prevalence because 
when treatment levels are high, there will be a higher proportion of HRP2-pos- 
itive individuals with cleared infections. This will have an impact on the ability 
of the test to detect villages suitable for MDA on a pre-screening. If there are 
high levels of detection and treatment of clinical malaria, the model predicts 
that a diagnostic with a sensitivity of 200, 20 or 2 parasites per microlitre will 
predict a prevalence closer to the prevalence by PCR at the beginning of the 
season (when prevalence is increasing) than at the end of the season (when 
prevalence is decreasing). In this setting a threshold of 20 parasites per micro- 
litre is predicted to be as accurate as PCR at predicting population prevalence 
during the early malaria season. 


DISCUSSION 

The success of test-and-treat strategies to clear the infectious reservoir of 
P. falciparum malaria will depend in part on the extent to which current and 
new diagnostic tools are able to detect individuals who are infectious to 
mosquitoes. Our results from a high-endemicity setting in Burkina Faso sug- 
gest that individuals with sub-microscopic infection contribute 29% to the 
infectious reservoir of the population. Given the close correlation between 
the sensitivity of RDTs and microscopy, it is likely that a similar proportion 
of the infected population would not be detected by RDTs. Thus, even if 
every person in the population is screened and treated at each round (which 
is operationally very unlikely), just under one-third of the infectious reservoir 
of the population would remain untreated. There is clearly a need to devel- 
op more sensitive diagnostic tools if MSAT strategies are to be successfully 
deployed in the field. 

By stratifying the data by parasite density, we were able to relate parasite 
density thresholds to the proportion of the infectious reservoir captured. At 
a threshold of 200 parasites per microlitre, only 22% of individuals with in- 
fection that is undetectable by microscopy have parasite densities measured 
with QT-NASBA above this threshold. Improving the diagnostic sensitivity 
to 20 or 2 parasites per microlitre increases the proportion of successfully 
identified infected individuals to 49% and 71%, respectively. It should be not- 
ed that these results are specific to the Burkina Faso study site — an area 
with very high and very seasonal transmission. Further research is required to 
understand how the results might be modified at lower transmission, in less 
seasonal settings or in settings with declining transmission. Similarly, possible 
associations of age with the detectability and transmissibility of infections, as 
well as with exposure to mosquitoes, require further study. 

Not all infectious individuals need to be detected and treated to reduce 
the effective reproduction number to below 1. At low endemicity, the effective 
reproduction number (that takes into account the impact of all interventions) 
will be closer to 1 and hence the proportion of infections that need to be de- 
tected and treated will be lower than at higher endemicity. At a given frequen- 
cy of MSAT rounds, there is therefore a critical threshold of infectivity that 
needs to be detected to reduce the reproduction number to below 1. 
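
The threshold argument above can be illustrated with a rough sketch. It assumes, as a simplification that is not taken from the transmission models used in this study, that removing a fraction p of the infectious reservoir scales the effective reproduction number R_e to (1 - p) R_e, so that the critical fraction is 1 - 1/R_e. The function name and the example values of R_e are hypothetical.

```python
def critical_fraction(r_effective: float) -> float:
    """Fraction of infectivity that must be removed so that
    (1 - p) * r_effective equals 1, under the proportional-scaling
    assumption described in the text above this block."""
    if r_effective <= 1.0:
        # Transmission is already not self-sustaining.
        return 0.0
    return 1.0 - 1.0 / r_effective


# Hypothetical effective reproduction numbers, from low to high endemicity.
for r_e in (1.1, 1.5, 3.0):
    print(f"R_e = {r_e}: critical fraction = {critical_fraction(r_e):.2f}")
```

Under this simplification the required fraction rises steeply as R_e moves away from 1, which is consistent with the statement that less of the reservoir needs to be detected at low endemicity.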

Using two different transmission models applied to an African setting 
with seasonal transmission, our results demonstrate that MSAT using a diag- 
nostic threshold of 200 parasites per microlitre could interrupt transmission 
at an EIR of between 0.5 and 1 if operational coverage is high and the inter- 
vention is continued for 8 years. By increasing the sensitivity of the diagnostic 
tenfold this is increased to an EIR of between 0.5 and 2.5, whereas increasing 
the sensitivity 100-fold would expand this to an EIR of between 2 and 4. Large 
areas of southern and eastern Africa have EIR values below 4, indicating that 
repeated MSAT programmes with a high-sensitivity diagnostic test of the 
order of 2 parasites per microlitre could be considered as an intervention in 
these areas. However, the EIR in Burkina Faso is much higher than 4, 
suggesting that MSAT would not be effective even with a highly sensitive 
diagnostic test (as confirmed in a recent trial). Equally, using a higher sensitivity 
diagnostic could reduce the number of treatment rounds required to interrupt 
transmission, thus reducing the operational and financial requirements and 
potentially increasing the community acceptance of the intervention. These 
results do have a number of caveats; in particular, as shown elsewhere, there 
are a number of factors (including overall coverage levels, the specific drug 
used, the correlation of coverage between rounds, the degree of importation 
from other areas, the size of the programme, and the seasonality, timing and 
coverage of other interventions) that can have a large impact on the success 
of MSAT and MDA programmes. Similar impacts of diagnostic sensitivity on 
the probability of elimination have been found. 




The costs and operational challenges involved in carrying out the testing 
may prove more important than diagnostic sensitivity in determining how 
many infections can be detected and treated. MDA may be a better option 
if it is acceptable to the local population because the additional prophylactic 
effect that is obtained by giving the drug to the whole population means that it 
is always theoretically more effective than MSAT. However, conducting multi- 
ple MDA rounds may be unpopular, especially if prevalence and incidence are 
low. We estimate that in the Zambia-like transmission setting described, MSAT 
using a diagnostic with a detection limit of 2 parasites per microlitre at a cov- 
erage of 80% is equally as effective as MDA at a coverage of 65% (results not 
shown). In addition, the wide-scale use of MDA has been anecdotally associated 
with parasite resistance, and there are worries that it may also accelerate 
the spread of artemisinin- and piperaquine-resistant malaria parasites. MSAT 
may therefore become the preferred option if transmission cannot be interrupt- 
ed with vector control alone and if diagnostics become sufficiently sensitive to 
detect the minimum fraction of infectious individuals to impact transmission. 

Targeted MDA is being considered in the Greater Mekong subregion. The 
ability to accurately target villages with high numbers of infected individuals 
is crucial. We used the MAEMOD model to compare the prevalence predicted 
by PCR, microscopy and RDTs with three different detection limits. The 
model predicts that a detection limit of 20 parasites per microlitre is as accurate 
as PCR at predicting population prevalence during the early malaria season. 
This result is linked to the presence of HRP2 after successful treatment (which is 
detectable for longer periods of time for lower detection limits). This leads 
to recently treated individuals testing positive by RDT after they have cleared 
their asexual parasites. This could be considered an advantage for measur- 
ing population prevalence for the purposes of triggering for MDA because 
this type of low specificity is an indication of recent infection and is therefore 
linked to the current true prevalence. This aspect can therefore be exploited 
for triggered MDA interventions, even though it would be a disadvantage for 
an MSAT intervention or for clinical malaria diagnosis, where it leads to the 
treatment of uninfected patients. Focal MDA could be more acceptable than MDA 
to local communities because pre-testing confirms that the parasite is present 
in the community. It could also be more attractive to policymakers, especially 
in Southeast Asia, as it would mean fewer courses of antimalarial drugs are 
given to uninfected individuals and more heavily infected communities can be 
identified and targeted. 

QT-NASBA is the most sensitive diagnostic used in the analysis; however, 
it is unable to detect all infections or infectiousness, as illustrated by the fact that 
1 out of 1,095 mosquitoes fed on blood from 28 QT-NASBA-negative individuals 
became infected. This may be due to degradation of nucleic acids, which 
disproportionately affects RNA-based diagnostics, or to a limitation in sensitivity 
regardless of sample handling. Gametocytaemic individuals with sub-PCR 
infection were also found in a cross-sectional survey in Tanzania using a new 
ultra-sensitive PCR technique. These findings illustrate that the available 
diagnostics do not detect all infections that are present in populations and that 
onward malaria transmission is possible (although with a low probability) from 
apparently parasite-negative individuals. Importantly, these sensitive molec- 
ular techniques are currently unfeasible for large-scale malaria interventions. 
Diagnostics in development need to strike a balance between being sensitive 
enough to detect a sufficient proportion of the asymptomatic reservoir, and 
remaining cheap, useable and having a quick turnaround of results. 

Health-seeking behaviour, diagnostics and 
transmission dynamics in the control of visceral 
leishmaniasis in the Indian subcontinent 


Graham F. Medley, T. Déirdre Hollingsworth, Piero L. Olliaro & Emily R. Adams 


Countries in the Indian subcontinent have committed to reducing the incidence of kala-azar, a clinical manifestation of visceral 
leishmaniasis, to below 1 in 10,000 by 2020. We address the role of timing of use and accuracy of diagnostics in kala-azar control 
and elimination. We use empirical data on health-seeking behaviour and health-system performance from the Indian state of 
Bihar, Bangladesh and Nepal to parameterize a mathematical model. Diagnosis of cases is key to case management, control and 
surveillance. Treatment of cases prevents onward transmission, and we show that the differences in time to diagnosis in these 
three settings explain the observed differences in incidence. Shortening the time from health-care seeking to diagnosis is likely 
to lead to dramatic reductions in incidence in Bihar, bringing the incidence down to the levels seen in Bangladesh and Nepal. 
The results emphasize the importance of maintaining population and health-system awareness, particularly as transmission and 
disease incidence decline. We explore the possibility of diagnosing patients before the onset of clinical kala-azar (before 14 days 
fever), and show that this could have a marked impact on incidence, even for a moderately sensitive test. However, limited spec- 
ificity (that results in false positives) is a major barrier to such a strategy. Diagnostic tests of high specificity used at an early stage 
of active infection, even if sensitivity is only moderate, could have a key role in the control of kala-azar, and prevent its resurgence 
when paired with the passive health-care system and tests of high sensitivity, such as the test for rK39 antibody response. 

The protozoan parasite Leishmania is transmitted by the bite of an infected 
sand fly. It disproportionately affects the poorest communities 
in endemic countries, and has an associated global mortality of 
200,000-400,000 per year. An elimination campaign has been running in 
the Indian subcontinent (India, Nepal, Bangladesh, Bhutan and Thailand) since 
2005. Three elimination time frames currently exist: the first, ending in 2015, 
is to establish progress that has been made; the second is the elimination 
of visceral leishmaniasis as a public health problem by 2017 (committed to 
by Indian subcontinent governments and visceral leishmaniasis programme 
managers); the third, as part of the London declaration on neglected tropical 
diseases, is to eliminate visceral leishmaniasis as a public health problem by 
2020 (defined as less than 1 case of kala-azar in 10,000 people in endemic 
areas, at the block (India) and upazila (sub-district; Bangladesh) level in the 
Indian subcontinent). In this Article we use the term kala-azar to define the 
clinical disease and manifestations of the infection caused by visceral leish- 
maniasis. 
Considerable progress has been made towards the target of less than 
1 case in 10,000 by implementing novel case-detection strategies, rapid di- 
agnostic testing and vector control activities. At present, Nepal has achieved 
elimination for two consecutive years and Bangladesh has reached the elim- 
ination targets in all upazilas (World Health Organization (WHO), personal 
communication). However, India has not yet reached these low levels and the 
latest data show more than 1 case per 10,000 people in endemic districts. This 
higher rate of incidence is thought to be due to a combination of differences 
in underlying transmission, pre-elimination campaign endemicity, health systems, 
diagnosis rates and the use and success of vector control programmes. 
However, the relative contribution of these different factors has yet to be 
quantified and will be a crucial determinant of the success of the expansion 
of control programmes. 

The case-defining conditions for kala-azar in the context of the elimina- 
tion programme in the Indian subcontinent are prolonged fever of more than 
2 weeks, splenomegaly and a positive rK39 test. The rK39 test is an anti- 
body-based detection, immunochromatographic test that has been shown to 
have high sensitivity (around 97%) when combined with clinical symptoms 
(Table 1). There is, however, an inherent delay in receiving treatment because 
clinical definition of a suspect case of kala-azar requires at least 2 weeks of fe- 
ver. This definition is used partly because of the low specificity of the rK39 test 
to identify infection rather than exposure, and to differentiate kala-azar from 
other fever-causing aetiologies. For the control of kala-azar, it has been 
proposed that a diagnostic test that detects active infection rather than the 
immune response to infection could be used. Testing earlier could identify more 
patients and interrupt transmission by early treatment. However, the speci- 
ficity of such a test would have to be high to avoid false positives, as many of 
the patients tested will present with non-specific symptoms. Consequently, 
the target product profile (TPP) of a diagnostic test that could identify less 
severe or even asymptomatic cases has not yet been defined. We considered 
the effect of such an intervention and the role of the specificity. 

Table 1 | Current diagnostic tests with sensitivity and specificity data taken from 
the Indian subcontinent where possible. 

Diagnostic test            Sensitivity                   Specificity 
rK39 RDT                   97.0% (95% CI = 90.0-99.5)    90.2% (95% CI = 76.1-97.7) 
DAT                        97.1% (95% CI = 94.9-98.4)    95.7% (95% CI = 88.1-98.5) 
Parasitology spleen        >95%                          100% 
Parasitology bone marrow   60-85%                        100% 
Antigen (KAtex)            68-100%                       98% 
PCR                        92.3% (95% CI = 88.4-94.9)    63.3% (95% CI = 53.9-71.8) 

CI, confidence interval; DAT, direct agglutination test; PCR, polymerase chain reaction; RDT, rapid 
diagnostic test. 

Within villages, cases of kala-azar tend to be clustered around index 
cases, suggesting that the drivers of local epidemics are individuals with 
kala-azar. Most infected people are asymptomatic, but may still contribute to 
transmission at low levels. Patients with post-kala-azar dermal leishmaniasis 
(PKDL) are known to harbour parasites, often in the dermal region, and are 
assumed to be infectious to sand flies. The relative role of people with kala-azar, 
asymptomatic individuals and people with PKDL in sustaining transmission 
has not been measured directly and is unknown. Some of this uncertainty 
will hopefully be addressed in the coming years, but at present more indirect 
methods are required to understand the dynamics of transmission and con- 
trol. If kala-azar cases contribute to most of transmission, then early diagnosis 
and treatment is likely to be a highly effective intervention. 

Very few modelling studies have been undertaken to help disentangle the 
interactions between individual, population and system processes for visceral 
leishmaniasis. This is partly because of a lack of quantitative information on 
the natural history of the disease. The approach we present is to use a single 
model to compare information from three different endemic areas. From this 
we can infer the rates of progression through different clinical states, largely 
based on the fact that health seeking and health care have different outcomes 
in these areas. We then extend this core health-seeking model to account for 
transmission, using a simple, parsimonious framework to investigate the im- 
pact of changes in diagnosis on transmission dynamics. 

We consider two interventions: reducing diagnostic delays in individuals 
who already fulfil the kala-azar definition, and introducing novel diagnostics to 
enable diagnosis of those who do not, or are yet to, fulfil the clinical definition 
of 14 days of fever and splenomegaly. We consider the dynamic consequences 
of these interventions, and highlight the potential for rebound epidemics as 
population (herd) immunity is curtailed. Finally, we consider the profile of a 
diagnostic required for diagnosis and treatment prior to full kala-azar, and em- 
phasise that specificity, rather than sensitivity, is the limiting factor. 


METHODS 

Empirical data. The data that inform the model are from studies on self-report- 
ed time from symptoms to health seeking and eventual diagnosis of kala-azar 
(Fig. 1a). In Nepal and Bihar, 92 patients with kala-azar who had experienced 
103 kala-azar episodes were interviewed. Patients waited for 30 days (95% 
confidence interval (CI) = 18-42) in Nepal before seeking health care, 3.75 
times longer than in Bihar, where patients waited 8 days (95% CI = 4-12). 
Conversely, the lag time from seeking health care to receiving a kala-azar diagnosis 
was 90 days (95% CI = 68-113) in Bihar compared with 25 days (95% CI = 
13-38) in Nepal. The time span between diagnosis and treatment was short in 
both countries. In Bangladesh, a 2007 cross-sectional study in Godagari 
Upazila, Rajshahi, by the International Centre for Diarrhoeal Disease 
Research examined the knowledge of, attitude to, and practice surrounding kala-azar 
and its treatment among communities and health providers. The study also used the 
rK39 dipstick test to screen for kala-azar in individuals who had had fever for 
more than 2 weeks. Around 5,000 households were surveyed; of these, 500 randomly 
selected household heads were interviewed, and indicated that it took 4 days 

Figure 1 | Data and model on delays in diagnosis. a, Time from onset of 
symptoms (defined as fever) to health-seeking and then diagnosis, for Bihar, 
Nepal and Bangladesh (D. Mondal, personal communication). b, Flow diagram 
of model. Open boxes show the behaviour model with progression 
post-infection through fever and full kala-azar (KA) (vertical flow), and from 
non-health-seeking to health-seeking (horizontal flow) behaviour. The shaded boxes 
and grey arrows indicate the extra states required for the transmission model. 


for people to seek medical help after onset of fever and 54 days until a correct 
diagnosis. The cumulative effect of these delays means that in Bihar patients 
are diagnosed 98 days after symptoms start, whereas in Bangladesh it is 58 
days and in Nepal it is 55 days. 
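
The cumulative delays quoted above follow from adding the two reported components for each setting. The short sketch below reproduces that arithmetic; the dictionary layout and variable names are illustrative only.

```python
# Mean delays in days reported in the text for each setting:
# (symptom onset to health seeking, health seeking to diagnosis).
delays = {
    "Bihar": (8, 90),
    "Bangladesh": (4, 54),
    "Nepal": (30, 25),
}

# Total delay from symptom onset to diagnosis is the sum of the two parts.
total_delay = {place: seek + diag for place, (seek, diag) in delays.items()}

for place, days in total_delay.items():
    print(f"{place}: {days} days from onset to diagnosis")
```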


Mathematical models. We initially developed a model without transmission 
that mirrors the available data on the retrospective cohort of patients who have 
been diagnosed with kala-azar (Fig. 1b). The model includes two basic transi- 
tions: progression of disease from the point of developing fever to diagnosis, and 
progression to health-seeking behaviour, giving four possible states: 
non-health-seeking fever (F_N), health-seeking fever (F_S), non-health-seeking kala-azar (K_N) 
and health-seeking kala-azar (K_S). Note that we are using ‘fever’ to denote 
non-specific symptoms (the patient recognises the start of the illness that leads 
to a kala-azar diagnosis, but the symptoms would have been insufficiently 
specific for diagnosis of kala-azar at that time). In the model, given a passive 
health-care system and the absence of better diagnostics, only individuals who are 
health seeking and have clinical kala-azar can be diagnosed. We include a single 
parameter for disease progression (transitions from F to K, duration denoted 
1/a, where a is the rate of progression), thereby assuming that kala-azar does 
not have different pathology in different countries. We include three parameters 
that are determined by the health-seeking and diagnostic patterns and are 
location specific: two parameters for the onset of health seeking (F_N to F_S and 
K_N to K_S rates, denoted b and c, respectively) and one parameter for diagnosis 
(K_S to D, denoted d). The equations are given in the Supplementary Information. 
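
As an informal illustration of this four-state structure, the sketch below computes the two expected delays that the data constrain. It assumes exponentially distributed waiting times with constant rates a, b, c and d; the published equations are in the Supplementary Information and may differ, and the function name and example durations are hypothetical.

```python
def expected_delays(a: float, b: float, c: float, d: float) -> tuple[float, float]:
    """Return (onset to health seeking, health seeking to diagnosis) in days.

    a: progression rate from fever to kala-azar
    b: rate of starting health seeking while in the fever state
    c: rate of starting health seeking while in the kala-azar state
    d: diagnosis rate for health-seeking kala-azar
    """
    # From non-health-seeking fever, the first event is either health
    # seeking (rate b) or progression to kala-azar (rate a).
    time_first_event = 1.0 / (a + b)
    p_seek_first = b / (a + b)

    # If kala-azar develops first, the patient still waits 1/c on average
    # before seeking care.
    onset_to_seeking = time_first_event + (1.0 - p_seek_first) * (1.0 / c)

    # If health seeking starts during fever, the patient waits 1/a for
    # kala-azar to develop; everyone then waits 1/d for diagnosis.
    seeking_to_diagnosis = p_seek_first * (1.0 / a) + 1.0 / d
    return onset_to_seeking, seeking_to_diagnosis


# Hypothetical mean durations in days, converted to rates.
seek, diag = expected_delays(a=1 / 40, b=1 / 60, c=1 / 10, d=1 / 20)
print(f"onset to health seeking: {seek:.1f} days; seeking to diagnosis: {diag:.1f} days")
```

Because the two observed delays depend on four rates, many rate combinations give the same pair of delays, which is why unique parameter values cannot be estimated.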

The model was parameterized to each locale. Each locale has two 
observations: the time from onset to health-seeking behaviour (entry into F_S or K_S), 
and the time from health seeking to diagnosis (K_S to D). These periods were 
expressed as functions of the four model parameters. Unique values of the 
parameters cannot be estimated, so we generated parameter sets of a, b, c 
and d that reproduce the observed times. We assumed that individuals with 
full kala-azar are more likely to seek health care than those with non-specific 
symptoms (c ≥ b). A grid of all possible integer values for durations in each 

Figure 2 | Expected time in each stage of the model in each setting. The 
estimated duration in each stage for the combined parameter sets from each 
locale. From left to right: F_N, the expected time before health seeking or 
kala-azar (duration of time in fever, non-health-seeking state); F_S, time spent 
health-seeking with fever before kala-azar development; K_N, duration spent with 
kala-azar before health seeking; K_S, duration spent health seeking with kala-azar; 
total, duration spent between onset and diagnosis. Within each column, the 
localities are Bangladesh (red), Nepal (blue) and Bihar (green). Note that there 
are correlations within the parameter sets so that the total time from onset to 
diagnosis is constant (final column). The violin plots indicate the variability in 
the parameter values, and the crosses mark the mean. 


state was compared with the observed outcomes. Parameter combinations 
that reproduce the observed times were retained, thus generating locale-spe- 
cific combinations of parameters that produced results consistent with the 
data. The values of the parameters within the sets were correlated. For exam- 
ple, to reproduce the observed time to diagnosis, a shorter period spent health 
seeking is combined with a longer time waiting for diagnosis. Therefore, the 
parameters were retained as groups of appropriate parameters rather than 
ranges of possible values. 
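
The grid procedure described above can be sketched as follows. This is not the authors' code: the delay formulas assume exponential waiting times, and the tolerance, the candidate durations and the function names are all assumptions made for illustration.

```python
from itertools import product


def expected_delays(dur_a, dur_b, dur_c, dur_d):
    """Expected (onset to seeking, seeking to diagnosis) delays in days,
    given mean durations 1/a, 1/b, 1/c and 1/d."""
    a, b = 1.0 / dur_a, 1.0 / dur_b
    p_seek_first = b / (a + b)  # health seeking starts before kala-azar
    onset_to_seeking = 1.0 / (a + b) + (1.0 - p_seek_first) * dur_c
    seeking_to_diagnosis = p_seek_first * dur_a + dur_d
    return onset_to_seeking, seeking_to_diagnosis


def grid_search(obs_seek, obs_diag, durations, tol=1.0):
    """Keep every (1/a, 1/b, 1/c, 1/d) combination whose predicted delays
    are within tol days of the observed delays."""
    kept = []
    for dur_a, dur_b, dur_c, dur_d in product(durations, repeat=4):
        if dur_c > dur_b:
            continue  # enforce c >= b, i.e. 1/c <= 1/b
        seek, diag = expected_delays(dur_a, dur_b, dur_c, dur_d)
        if abs(seek - obs_seek) <= tol and abs(diag - obs_diag) <= tol:
            kept.append((dur_a, dur_b, dur_c, dur_d))
    return kept


# Observed delays for Nepal from the text: 30 days to health seeking and
# 25 days from health seeking to diagnosis. The coarse 5-day grid is
# illustrative only.
nepal_sets = grid_search(30, 25, range(5, 101, 5))
print(f"{len(nepal_sets)} parameter sets retained for Nepal on this grid")
```

Retaining whole tuples rather than separate ranges for each parameter preserves the correlations between parameters noted in the text.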

The model was then extended to include transmission dynamics (Fig. 1b). 
The extended model includes an incidence of infection and a latent class to 
account for delays between infection and onset of symptoms. Most latently 
infected individuals recover directly to the dormant stage without progressing 
to symptoms leading to kala-azar. The instantaneous rate of infection is 
calculated as the weighted sum of the force of infection from the various 
potentially infectious states: latent, fever, kala-azar and dormant. The equations, 
including the analytical solution for the basic reproduction number, R_0, and the 
equilibrium state, are given in the Supplementary Information. The basic 
reproduction number, R_0, is defined as the average number of onward infections caused 
by a single infectious individual in a wholly susceptible population throughout 
the infectious period of that individual. R_0 is a combination of the number of 
infections generated in each infection stage, which in turn is determined by 
the duration and infectiousness of the stage. For an endemic disease such as 
visceral leishmaniasis, populations are not wholly susceptible, and the number 
of onward transmissions will be reduced by potential contacts that are already 
infected or immune. 
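
A minimal sketch of this stage-wise view of the reproduction number is given below. It assumes that each stage contributes the product of the probability of reaching it, its infectiousness and its mean duration, and that the effective value scales with the susceptible fraction. The exact expressions are in the Supplementary Information; every number and name here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    p_reach: float         # probability an infected person reaches this stage
    infectiousness: float  # onward infections per day spent in the stage
    duration: float        # mean days spent in the stage


def basic_reproduction_number(stages):
    """Sum the expected onward infections generated in each stage."""
    return sum(s.p_reach * s.infectiousness * s.duration for s in stages)


def effective_reproduction_number(r0, susceptible_fraction):
    """Scale by the fraction of contacts that are still susceptible."""
    return r0 * susceptible_fraction


# Entirely hypothetical values, chosen only to show the bookkeeping.
stages = [
    Stage("latent", 1.0, 0.001, 150.0),
    Stage("fever", 0.05, 0.05, 40.0),
    Stage("kala-azar", 0.05, 0.5, 60.0),
    Stage("dormant", 0.95, 0.0005, 1000.0),
]
r0 = basic_reproduction_number(stages)
print(f"R_0 = {r0:.2f}; effective value with 60% susceptible = "
      f"{effective_reproduction_number(r0, 0.6):.2f}")
```

Shortening the duration of a highly infectious stage, for example by earlier diagnosis of kala-azar, reduces that stage's term in the sum.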

The duration of the latent stage, relative infectiousness of different stages 
and R_0 cannot be measured with current diagnostics and available 
epidemiological data. Consequently, we chose values that are consistent with current 
understanding, that reproduce patterns consistent with general 
observations, and so that the equilibrium state of the model in the three settings 
was consistent with relative epidemiological patterns. All of the biological 
parameters were the same across all three settings, so differences between 
settings were due to the differences in health-seeking behaviour and health- 
care response (see Supplementary Information). 

Figure 3 | Impact of diagnosis delay on transmission. Sensitivity analysis of 

the effect of different overall transmissibility, By (columns), and the relative 
infectiousness of individuals with non-specific symptoms, 3, (rows). Within 

each panel the expected equilibrium incidence (cases per 10,000 per year) is 
plotted against the basic reproduction number, R_0 (dotted line). Each green point 
corresponds to one of the health-seeking, diagnosis and progression parameter 
sets for Bihar; blue are for Nepal. The points for Bangladesh are masked by those 
for Nepal. 


We then used the model to examine two potential interventions. First, 
we considered the impact of reducing the time to diagnosis in Bihar to that 
estimated for Nepal and Bangladesh. Second, we examined the impact of a 
diagnostic and treatment intervention applied during the pre-kala-azar fe- 
ver stage. We show results both for the average incidence at equilibrium, 
which may take many years to be reached, and for the average number of 
diagnoses over the first 5 years from introduction, to demonstrate the short- 
er-term effects. The impact of a diagnostic test is determined by both the 
sensitivity and specificity of each test, and how it is applied, for example a 
single test per patient or multiple testing. We include a testing rate, T, so that 
the average interval between tests is 1/T, and the proportion of individuals 
correctly diagnosed before the onset of kala-azar is T S_e / (T S_e + a), where S_e 
is the per-test sensitivity. To account for specificity, we assume that there is 
a background rate of 200 additional cases of fever owing to other infections 
and not related to visceral leishmaniasis, and calculate the rate at which 
false positives arise as a function of the specificity per test, S_p, and the rate 
of testing. We present equilibrium results, and the numbers of true and false 
positives that are expected to arise over 5 years that follow the introduction 
of such an intervention. 
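
The proportion diagnosed before kala-azar can be read as a competition between two rates: detection (testing rate times per-test sensitivity) and disease progression (rate a). The sketch below evaluates that ratio and inverts it to find the testing rate needed for a target proportion. The function names and example values are hypothetical, and the false-positive calculation is not reproduced because its formula is not given in the text.

```python
def proportion_diagnosed_early(test_rate, sensitivity, progression_rate):
    """Proportion detected during fever, before progression to kala-azar.

    test_rate: tests per day (T), so the mean interval between tests is 1/T
    sensitivity: per-test sensitivity, between 0 and 1
    progression_rate: rate of progression from fever to kala-azar (a)
    """
    detection_rate = test_rate * sensitivity
    return detection_rate / (detection_rate + progression_rate)


def required_test_rate(target, sensitivity, progression_rate):
    """Testing rate T needed so that a target proportion is diagnosed early,
    obtained by solving target = T*S / (T*S + a) for T."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    return progression_rate * target / ((1.0 - target) * sensitivity)


# Hypothetical example: fever lasts 45 days on average before kala-azar,
# tests are repeated every 30 days, per-test sensitivity is 80%.
a = 1.0 / 45.0
p = proportion_diagnosed_early(test_rate=1.0 / 30.0, sensitivity=0.8, progression_rate=a)
print(f"proportion diagnosed before kala-azar: {p:.2f}")
print(f"mean days between tests needed for 30%: {1.0 / required_test_rate(0.3, 0.8, a):.0f}")
```

The ratio shows why a moderately sensitive test can still be useful: lower sensitivity can be offset by more frequent testing, whereas specificity has no such compensation.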


RESULTS 

Estimated pathways to diagnosis. The parameter sets that are consistent with 
the data are shown in Figure 2. There are 526 unique parameter combinations 
(a, b, c, d) for Bihar, 2,320 for Nepal and 312 for Bangladesh. The time between 
fever onset and progression to clinical kala-azar is 33 days < 1/a < 55 days. In 
Nepal, individuals develop the disease faster than they seek health care, so that 
the typical path for an individual is fever to kala-azar to health seeking to diag- 
nosis, with most individuals diagnosed and treated within the first few weeks of 
symptoms. In Bihar and Bangladesh, health seeking starts much earlier, but di- 
agnosis is slower so that individuals enter the health system, but then remain in 
the symptomatic state without treatment. The major difference between Nepal 
and the other two regions is that kala-azar cases in Nepal most frequently first 
present to health-care services as clinical kala-azar, whereas in Bihar and Bang- 
ladesh people first present with non-specific symptoms. Consequently, in Bi- 
har, patients are likely to be diagnosed and treated for more common infections 
(such as bacterial infection) owing to presentation of non-specific symptoms. 
There is a risk that treatment failure, rather than misdiagnosis, is blamed, lead- 
ing to repeat treatments for the wrong diagnosis. In addition, many patients 
present to unqualified practitioners or the private health-care system, delaying 
correct diagnosis. 

Figure 4 | The impact of interventions on transmission dynamics. a, The impact 
of reducing the time to diagnosis in Bihar to 10 days plus 14 days fever; the 
green bar indicates the duration of intervention. There is an initial increase 
in diagnoses of kala-azar (KA, black lines), which then leads to a reduction in 
underlying incidence and incidence of kala-azar. The intervention is lifted at 5 
years, leading to a resurgence of cases and a large outbreak several years later. 
The different lines are for different parameter sets. b, Alternatively, the impact of 
introducing a diagnostic for febrile cases such that 30% of cases do not progress 
to kala-azar. This increases the number of diagnoses during the fever stage 
(red line) and decreases the number diagnosed with kala-azar (blue); the total 
diagnoses are shown in black. In this case, as the intervention does not lead to 
elimination, an epidemic occurs while the intervention is in place (green bar). 
When the intervention is lifted there is a further resurgence of cases. 


Transmission dynamics. As with all transmission models, incidence of diagnosis 
increases non-linearly with the basic reproduction number, R_0, for all values 
of the transmission parameters (Fig. 3). At equilibrium, as long as individuals 
with kala-azar provide most infection to vectors, the transmission potential is 
higher in Bihar than Nepal (the average number of onward transmissions per 
infected individual, R_0, will be different). Only when transmission rates are high 
(Fig. 3), and therefore the number of onward infections, R_0, is also high, are the 
incidences of disease in all settings similar. At this level of transmission, the 
diagnostic differences are masked by the high infectiousness of latent, fever and 
dormant cases and variation between parameter sets and settings is obliterated. 
In reality, the transmission dynamics are not at equilibrium, and incidences are 
reducing in all locations due, at least in part, to reductions in R_0. Consequently, 
differences in incidence between settings can be explained by the differences in 
diagnostic delays that are consistent with the observed durations (Fig. 2). This 
suggests that variability in health-care seeking and diagnostic delays between 
different settings, and the resulting distribution of times with non-specific and 
specific symptoms are likely to have an impact on transmission patterns, particularly
if there is differential infectivity at these different stages of infection.


Dynamic consequences of interventions. Owing to a lack of further informa- 
tion, we set the relative transmission parameters to be those in the central 
panel of Figure 3 (see Table 1 for parameter values). At these values, the equi- 
librium for Nepal and Bangladesh is low, and elimination has been achieved for 
some parameter sets, and the equilibrium for Bihar is between 4 and 5 cases 
per 10,000 people per year. 

Figure 5 | The impact of a diagnostic among those with non-specific symptoms
(fever). a, The equilibrium incidence of diagnosed cases (achieved after many
decades) with different proportions diagnosed before full kala-azar; total
incidence of diagnosed cases (black), those that are diagnosed with less specific
symptoms (fever, red) and full kala-azar (blue). A 75% reduction is enough for
infection elimination, 55% is enough for elimination as a public health problem
(<1 case per 10,000 per year). b, The cumulative incidence over a shorter time
scale of 5 years with different proportions diagnosed before kala-azar. Note
the diminishing returns: halving the total number of cases of kala-azar can be
achieved by preventing only 25% of cases from progressing.

We consider two interventions that shorten the period of high infectious-
ness and transmission potential. The impact of these interventions on the dy-
namics of diagnoses is similar (Fig. 4). First, we switched the delay in diagno-
sis seen in Bihar (43-63 days) to the average delay in Nepal (10 days, added to
the 14 days of fever to become a clinical suspect; Fig. 4a). A reduction in time
to diagnosis shows an initial large peak in cases, as the cases that are ‘waiting’
to be diagnosed are found. This leads to a rapid reduction in transmission, 
which then leads to a decline in incidence. For a setting with an incidence of 
5 per 10,000 people per year, elimination would be achieved with this change 
in the current model. Whether or not elimination would be achieved in reality 
depends on, among other things, the details of transmission from asympto- 
matic infections and spatial heterogeneities. If diagnostic delay is returned to 
its previous length after 4 years, then there is a rebound epidemic, but this is 
much slower and occurs over several years. Note that the predicted patterns 
are similar for all the parameter sets fitted to the Bihar situation. 

Second, we introduce the diagnosis and treatment of patients while they 
are health seeking with non-specific febrile symptoms, before they develop 
kala-azar (before they have passed 14 days of fever and have splenomegaly; 
Fig. 4b). The dynamic response is similar to the dynamics of reducing diagnostic
delays for those who have kala-azar, but without the immediate diagnostic
spike. The modelled intervention (30% of kala-azar cases are diagnosed
during non-specific fever) is insufficient to eliminate infection. Consequently,
there is an epidemic while the intervention is in place, owing to the build-up
of individuals with increased susceptibility to infection who were previously
protected by the reduction in transmission. If the intervention is kept in place,
then incidence eventually returns to a low level. When the intervention is removed,
the second epidemic is a consequence of the increase in transmission
owing to the longer infectious period when screening of fever cases is stopped. 

Figure 6 | The interaction between sensitivity and specificity. a, The equilibrium incidence of kala-azar cases as a function of the sensitivity per test and the testing
interval. Elimination is only achieved for frequently applied tests (under 15 days), but can be achieved with relatively low sensitivity. b, As in a but the cumulative
incidence is plotted over 5 years from the start of the intervention. This tends to reduce the importance of frequent testing for highly sensitive tests. c, The
equilibrium incidence of false positives, assuming 200 non-visceral leishmaniasis fever cases as a function of testing interval and specificity. The relationship with
specificity is linear. Note the juxtaposition of the highest numbers of false positives (short testing interval, low specificity) with the lowest numbers of true positives
(short testing interval, high sensitivity in a). d, The proportion of positives that are true positives (the positive predictive value), calculated as the average over 5 years
after introduction of testing, as a function of sensitivity and specificity. The testing interval is fixed at 14 days (people seeking health care provide an opportunity for
testing once every 2 weeks on average), and there are 200 people with non-visceral leishmaniasis fever who are health seeking. The shape is dominated by specificity
owing to the numerical scales of b and c.

In both interventions, the supply of full clinical kala-azar diagnoses is cur-
tailed, transmission is reduced and there is a reduction of population (herd)
immunity that leads to a bounce back of cases if the intervention is stopped.
The speed at which the subsequent epidemic occurs is dependent on the
success of the intervention: better curtailment of transmission results in a
longer period to the next epidemic. The slow build-up means that there will
be no obvious link between lengthening diagnosis delay and incidence, with
clear implications for monitoring efficiency of diagnosis and treatment. In
particular, if this intervention were implemented, then potentially there would
be a reduction in clinical awareness of visceral leishmaniasis, resulting in a
lengthening of diagnostic delays. We suggest that although reducing diagnosis
delay through special efforts is likely to be an effective means to short-term
reduction in cases, it is, depending on the epidemiological setting (baseline 
prevalence and biting rates), unlikely to be a sustainable route to long-term 
elimination. The quantitative details depend on specific parameter values and 
model structure, but this general pattern will be observed if those with kala-azar
contribute most of the infection to vectors and onward transmission,
and if there is sufficient (concomitant) immunity to kala-azar. 


Impact of sensitivity of a novel diagnostic. We also consider the consequenc- 
es of differing diagnostic sensitivities for diagnosis during the non-specific 
symptom phase in terms of its impact on equilibrium and short-term incidence 
of diagnoses (Fig. 5). Increased sensitivity will result in a larger proportion of 
cases being diagnosed earlier. At equilibrium, as the proportion of cases diag- 
nosed increases, the incidence of diagnosis during the fever stage increases, 
but total diagnoses fall owing to reduced transmission (Fig. 5a). Elimination 
of transmission is possible even with relatively moderate sensitivity. If most 
transmission is from kala-azar cases, then halving the numbers progressing to 
this state will halve R0. However, achieving equilibrium incidence in such models
takes many decades and is mainly of theoretical interest. Consequently, we
also consider the impact on the numbers of cases expected over the 5 years 
that follow the introduction of the intervention (Fig. 5b). Clearly, the dramatic 
fall in cases (Fig. 4b) occurs for low values of sensitivity, and over the short- 
term there is little extra to be gained from diagnosing more than 25-30% of 
cases before kala-azar. 
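
The scaling argument in this paragraph can be made concrete with a minimal sketch. It assumes, as a deliberate simplification that is not the fitted model, that all onward transmission comes from kala-azar cases, so that the reproduction number scales with the fraction of cases still progressing to kala-azar. The function names and the example values of R0 are hypothetical.

```python
def effective_r(r0: float, fraction_prevented: float) -> float:
    """Reproduction number after a fraction of cases is diagnosed before kala-azar.

    Assumption (not from the fitted model): only kala-azar cases transmit,
    so R scales linearly with the fraction of cases that still progress.
    """
    if not 0.0 <= fraction_prevented <= 1.0:
        raise ValueError("fraction_prevented must lie between 0 and 1")
    return r0 * (1.0 - fraction_prevented)


def critical_fraction(r0: float) -> float:
    """Smallest fraction of cases to prevent so that R falls to 1 (0 if R0 <= 1)."""
    if r0 <= 0.0:
        raise ValueError("r0 must be positive")
    return max(0.0, 1.0 - 1.0 / r0)


if __name__ == "__main__":
    # Hypothetical R0 values, chosen only to illustrate the scaling.
    for r0 in (1.5, 2.0, 4.0):
        print(r0, effective_r(r0, 0.5), critical_fraction(r0))
```

Under this simplification, halving the number progressing halves R0, and the fraction needed to bring R to 1 rises with R0. Any infectiousness of latent, fever or dormant cases would weaken the effect.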


Profile of the diagnostic. We profiled the diagnostic required for early case 
detection in terms of both sensitivity and specificity (Fig. 6). The size of the 
false-positive problem is determined by the frequency of fever cases (visceral 
leishmaniasis and non-visceral leishmaniasis) being tested for kala-azar, the 
specificity of the test and the proportion of fever cases that are due to viscer- 
al leishmaniasis. We, therefore, modelled a scenario in which there is a back- 
ground rate of non-visceral leishmaniasis cases that present for diagnosis, and, 
to account for the possibility of multiple testing owing to the long duration of
the fever stage for kala-azar (a known phenomenon), we include a frequency
of testing for health-seeking kala-azar cases with fever. If each person who
seeks health care presents once, then the proportion of true cases diagnosed is 
the sensitivity (Fig. 5). However, if people are tested at multiple consultations, 
then there are multiple opportunities for a correct, true-positive result, and a 
lack in sensitivity can be overcome by more frequent testing (Fig. 6a,b). How- 
ever, as the number of testing occasions increases, so the rate at which false 
positives are found increases, and increasing specificity linearly decreases the 
false positives (Fig. 6c). Holding the test frequency constant, the average pos- 
itive predictive value (proportion of positives that are true positives) is shown 
in Fig. 6d as a function of sensitivity and specificity. There are likely to be many 
hundreds to thousands of false positives for every true positive identified un- 
less the diagnostic, or diagnostic combination, is more than 99% specific. This 
problem is amplified by a decreasing prevalence of true infection, as witnessed 
in any elimination setting. 
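
The interplay between repeated testing, sensitivity and specificity described above can be illustrated with a short sketch. It assumes that test results are independent between consultations, and the ratio of one true case to 200 non-visceral leishmaniasis fever cases is a hypothetical illustration rather than a fitted value. The function names are ours and do not come from the model.

```python
def cumulative_sensitivity(sensitivity: float, n_tests: int) -> float:
    """Probability that a true case tests positive at least once in n_tests tests.

    Assumes independent results at each consultation (a simplification).
    """
    return 1.0 - (1.0 - sensitivity) ** n_tests


def expected_false_positives(n_uninfected: float, specificity: float,
                             n_tests: int = 1) -> float:
    """Expected number of false-positive results; linear in (1 - specificity)."""
    return n_uninfected * (1.0 - specificity) * n_tests


def positive_predictive_value(n_infected: float, n_uninfected: float,
                              sensitivity: float, specificity: float) -> float:
    """Proportion of positive results that are true positives (single test each)."""
    true_pos = n_infected * sensitivity
    false_pos = n_uninfected * (1.0 - specificity)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive results expected")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical scenario: 1 true case among 200 non-VL fever cases.
    print(cumulative_sensitivity(0.3, 3))            # repeated testing compensates for low sensitivity
    print(expected_false_positives(200, 0.99))       # false positives with 99% specificity
    print(positive_predictive_value(1, 200, 0.3, 0.99))
```

In this hypothetical scenario, false positives outnumber true positives even at 99% specificity, and the imbalance grows as the number of true cases falls.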


DISCUSSION 

Our principal conclusion is that earlier diagnosis and prompt therapy have the 
potential to reduce ongoing transmission to elimination or near-elimination 
levels. This is indeed one of the pillars of the current elimination campaign, 
and, although this is likely to have already affected transmission in Nepal and 
Bangladesh, there is a large potential gain in Bihar, given that the diagnostic 
delays in this area are longer. We have also highlighted that curtailing trans- 
mission is likely to decrease population immunity in the long-term, so that 
there is a potential for large epidemics if vigilance is not maintained and diagnostic
delays are allowed to increase. If diagnostic delays lengthen, whether at the
stage of health-care seeking or in the ability of the system to recognize kala-azar,
then subsequent epidemics are predicted to have a long lead time, and will 
not be immediately recognized. We have shown that the introduction of novel 
diagnostics on non-specific fever cases (before full kala-azar) can be effective 
even if sensitivity is relatively low, but that their introduction is prevented at 
present by less than ideal specificity, given the issues of delivery, safety and 
cost of treatments. The paucity of data available to fit more complex models
means that our results rely as much on understanding of infectious-disease 
epidemiology as on our simple model. Nonetheless, we believe that these con- 
clusions are supported by current understanding, and, if nothing else, are valid, 
strong hypotheses. 

We studied two mechanisms by which earlier diagnosis could occur in high- 
ly endemic settings, such as Bihar. The first is improving the health system to 
reduce delays in treatment. At present, differences in the time to diagnosis be- 
tween countries are assumed to be because of differing time spent in the private 
health-care system where knowledge of kala-azar is relatively poor. Potentially, 
patients present with less-specific symptoms and enter a different diagnostic 
algorithm, and kala-azar is only suspected later. A study in Bihar using accred- 
ited social health activists (community health workers in India) to identify and 
refer suspected kala-azar cases showed that the time from onset of fever to 
seeking treatment and diagnosis at peripheral health facilities could be reduced 
to 32-50 days in total. Our modelling reinforces the observation that reducing
the time to diagnosis is an effective intervention, but as prevalence decreases 
it will be more difficult to maintain the necessary knowledge and infrastructure 
to sustain this. Intriguingly, the efficiency of the diagnostic process may have 
a natural equilibrium, depending on the frequency of diagnosis. We have also 
shown that resurgence in cases will occur after several years if transmission is 
reduced and not halted, and that the peak of the resurgence may well be higher 
than pre-intervention. Typically, local kala-azar outbreaks occur regularly: three
peaks have occurred in India 14-15 years apart over the past 40 years.

The second mechanism to reduce the time to diagnosis in highly endem- 
ic areas such as Bihar is to identify active infection before kala-azar onset. 
Specificity becomes important when designing a test that targets early case 
detection (before 14 days of fever). Only a test with very high specificity will 
allow patients to be treated, given the limited range of treatments available 
(toxicity, cost, administration or adherence, depending on the treatment). 
Nonetheless, if specificity is high enough to consider treating patients before 
they have 14 days of fever and become a clinical suspect, then patients could 
be tested and treated much earlier than is possible with the current diagnos- 
tic algorithm. This would eliminate a pool of patients that transmit visceral 
leishmaniasis to their local community, thereby substantially reducing future 
cases. Our analysis demonstrates that sensitivity of early testing for visceral 
leishmaniasis is not the main problem. Even a moderately sensitive test (30%) 
can have a dramatic effect on kala-azar transmission alongside the current testing
algorithm of passive surveillance with rK39. The challenge to testing earlier
in the course of the disease, with the intention of treating, is avoiding a large
number of false positives. However, as prevalence decreases, the positive pre- 
dictive value of all tests will fall as more patients are false positives than true 
positives, but the resultant gain in sustainable elimination (if such a test were 
to be implemented) is significant. 


Limitations of this model and data gaps 

There are few well-conducted studies to describe the natural history of viscer- 
al leishmaniasis, which is multifactorial’®, and hence the risks for an infected 
individual to become diseased are not quantified. Owing to the localized epi- 
demics that are seen surrounding index cases it is reasonable to assume that 
patients with kala-azar are the main reservoir of infection for sand flies. It is also 
thought that those with PKDL and asymptomatic individuals or those with
a dormant infection can also be infectious to sand flies, but the evidence base 
for this relies on anecdotal studies. Only one experimental study® has been 
published whereby sand flies were allowed to feed on four people with PKDL. 
Of the 400 fed flies, 104 became infected. In our analysis we have assumed 
that asymptomatic individuals and those with PKDL are, on average across 
the population, considerably less infectious than symptomatic cases®”. Should 
asymptomatic individuals be more infectious than anticipated, the relative importance
of kala-azar cases to R0 will decrease. This means that the period of
infectiousness will increase, reducing the probability of elimination and neces- 
sitating lower overall transmission rates to maintain the levels of incidence in 
the 1-5 cases per 10,000 people per year. If people with kala-azar are less infec- 
tious than we have assumed, the impact of reduction in diagnostic delays will 
be reduced, but the effect of early, novel diagnostics will be enhanced (results 
not shown). Of course, diagnostics for asymptomatic cases and people with 
PKDL may be essential for elimination if these groups are shown to be involved
in transmission of disease and maintain long-term infections. 

Unfortunately, without more detailed data, the models we have described 
are only able to highlight the qualitative patterns, although they are robust 
across a large range of biologically plausible parameters. The health-seeking 
behaviour data are informative about ranges of parameters, including the du- 
ration of non-specific symptoms, a parameter that has not been previously 
estimated. The data would be greatly improved by systematically sampling a 
population in which incidence was also estimated. 

In addition to the uncertainties on the human side of transmission, we 
also know little of sand-fly behaviour, life expectancy and range. Clearly, these 
vector dynamics will have an important role in the transmission cycle and in 
the design of effective interventions against transmission. 


Diagnostics of the future 

Diagnostics play a crucial part in the control and elimination of kala-azar and 
are a research priority’. We argue that one avenue to revolutionize the cur- 
rent control algorithm is to develop a highly specific test of active infection 
(whether symptomatic or asymptomatic), even if limited in sensitivity. There 
is currently no suitable antigen test capable of detecting active kala-azar. The 
only commercially available product, KAtex, has challenges in utility and sub- 
optimal sensitivity, although we would argue that low sensitivity should not 
necessarily be considered a barrier to implementation. Simplified molecular 
diagnostic tools that can be adapted for field situations are under develop- 
ment, including loop-mediated isothermal amplification*®. These tests, along 
with standard polymerase chain reaction, are able to detect circulating DNA 
in the blood of individuals who are actively infected. Studies show that molec- 
ular tests are very sensitive, and although they have low specificity, it may be 
that these are infections that are below the limit of detection of the reference 
standard, and therefore in fact true positives’. Diagnostics that are able to de- 
tect asymptomatic infection and PKDL may also be important, depending on 
the relative role of transmission in these groups. 

Given the uncertainty in clinical progression of kala-azar and diagnostic 
performance, there is clearly a chance that some, possibly much, of the mor- 
bidity and mortality caused by Leishmania infection is being misclassified. The 
process of diagnosis of kala-azar provides both the clinical information for 
treatment, as well as the data on which surveillance is built. Consequently, it is 
inevitable that 100% of cases of kala-azar are diagnosed and treated, regard- 
less of the performance of the health-care system or the health-seeking be- 
haviour of the population. Ideally, surveillance would include information that 
is independent of clinical diagnosis. This is particularly needed as kala-azar 
becomes rarer, and it is likely that both clinical and patient awareness wane.
A diagnostic tool that enables population surveillance of infection and disease, 
independent of clinical diagnosis, is a crucial step in achieving, enforcing and 
demonstrating elimination. 
Ebola emerged in West Africa around December 2013 and swept through Guinea, Sierra Leone and Liberia, giving rise to 27,748
confirmed, probable and suspected cases reported by 29 July 2015. Case diagnoses during the epidemic have relied on polymer- 
ase chain reaction-based tests. Owing to limited laboratory capacity and local transport infrastructure, the delays from sample 
collection to test results being available have often been 2 days or more. Point-of-care rapid diagnostic tests offer the potential 
to substantially reduce these delays. We review Ebola rapid diagnostic tests approved by the World Health Organization and 
those currently in development. Such rapid diagnostic tests could allow early triaging of patients, thereby reducing the potential 
for nosocomial transmission. In addition, despite the lower test accuracy, rapid diagnostic test-based diagnosis may be benefi- 
cial in some contexts because of the reduced time spent by uninfected individuals in health-care settings where they may be at 
increased risk of infection; this also frees up hospital beds. We use mathematical modelling to explore the potential benefits of 
diagnostic testing strategies involving rapid diagnostic tests alone and in combination with polymerase chain reaction testing. 
Our analysis indicates that the use of rapid diagnostic tests with sensitivity and specificity comparable with those currently under 
development always enhances control, whether evaluated at a health-care-unit or population level. If such tests had been available
throughout the recent epidemic, we estimate, for Sierra Leone, that their use in combination with confirmatory polymerase
chain-reaction testing might have reduced the scale of the epidemic by over a third.
The unprecedented scale of the 2014-15 West African Ebola epidemic
has posed major challenges for delivering rapid diagnosis, an essential
component of controlling Ebola epidemics, given the non-specific
nature of early clinical symptoms. Following World Health Organization
(WHO) guidelines, testing has relied on reverse transcription polymerase 
chain reaction (RT-PCR) based methods, which detect viral RNA in serum or 
plasma. However, these tests are slow and costly, taking 2-6 hours to process
at around US$100 per test. Testing requires high levels of biosafety, and
both samples and reagents must be kept cold or frozen. Sustaining a cold
chain has been particularly challenging in the affected countries because of
the limited infrastructure, frequent power cuts and hot climate. In addition, 
collecting venous blood for PCR testing requires specific medical training and 
poses significant risks for health-care workers’. 

Furthermore, although the laboratory processing time for PCR testing can 
be under 6 hours, the time between sample collection and receiving the result 
has often been much longer during the epidemic, owing to limited laboratory 
capacity and logistical infrastructure®. This has been a crucial issue, because 
delays in testing lead to longer hospital stays for patients, increasing both 
bed demand and the likelihood of nosocomial transmission (a major issue 
in Ebola outbreaks’). At the peak of the epidemic in Sierra Leone (between 
October and November 2014) there were reports of delays in test results of 
up to 1 week. Only in January 2015, following an effort to reduce delays in
laboratory access, did the WHO report substantially reduced average waiting
times for test results of between 1.3 and 2.3 days, depending on the country”. 

Recognizing the extraordinary circumstances of the epidemic, in Novem- 
ber 2014 WHO issued a call for “rapid, sensitive, safe and simple Ebola di- 
agnostic tests”*. An ideal rapid diagnostic test (RDT) would require minimal 
laboratory facilities and staff training, no cold chain and could be performed 
with a capillary blood sample collected through a finger prick. Such an RDT 
could be used at the point of care to allow faster triaging of people suspected 
of having Ebola, returning a test result in minutes rather than days. 

As a result of the WHO initiative, four tests have already been approved, 
including the benchmark PCR test, RealStar, and an RDT, ReEBOV (Table 1).
In addition, the US Food and Drug Administration has authorized ten
tests for emergency use” and multiple alternative RDTs are in development 
(Supplementary Table 1). 

Although recognizing that rapid and accurate diagnostics are crucial to 
successful containment of an Ebola outbreak, determining the best strategy 
and the impact of RDTs is not straightforward (Fig. 1). Poor diagnostic speci- 
ficity risks patients without Ebola infection being admitted to Ebola treatment 
units (ETUs) — where they potentially acquire infection, and use a bed that 
could otherwise be used by a patient who does have Ebola. Conversely, poor 
diagnostic sensitivity can lead to infected individuals being discharged back to 
the community or sent to non-Ebola-specific wards, with the consequent risk 
of onward transmission. In this Review, we use mathematical models of Ebola
transmission to explore how RDTs might best be used in future outbreaks, bal-
ancing the rapid availability of RDT results against the lower sensitivity and
specificity of such tests. This trade-off makes optimization of testing strategies
complex and context-dependent, from both a technical and ethical perspective.

75724 Paris Cedex 15, France. Correspondence should be addressed to N. M. F. e-mail: neil.ferguson@imperial.ac.uk.

Table 1 | Characteristics of diagnostic tests that have been approved by the World Health Organization.

Test name: RealStar Filovirus*
  Detected: Ebola-specific RNA
  Sensitivity: NR
  Specificity: NR
  95% limit of detection: 1,390 RNA cps per ml (95% CI = 690-5,320)
  Time to result: Hours
  Principal logistic challenge: Kit is shipped on dry ice and should arrive frozen and be kept at -20 °C; needs equipment, including an appropriate PCR machine; needs special training; and needs high safety level in the laboratory
  Cost: NR

Test name: ReEBOV Antigen Rapid Test*
  Detected: Ebola virus (EBOV) VP40 antigen
  Sensitivity: 91.8% (95% CI = 84.5-96.8) compared with RealStar
  Specificity: 84.6% (95% CI = 78.8-89.4) compared with RealStar
  95% limit of detection: 6.25E+02 ng per ml Ebola rVP40 antigen
  Time to result: 15 minutes
  Principal logistic challenge: Requires refrigeration at 2-8 °C, requires visual interpretation, should be used in biosafety level 4 facility or with full PPE and can use whole blood from finger prick or venipuncture
  Cost: NR

Test name: Xpert*
  Detected: Ebola-specific RNA
  Sensitivity: For Ebola Mayinga RNA: 100% (95% CI = 92.9-100.0) (PPA)
  Specificity: For Ebola Mayinga RNA: 100.0% (95% CI = 92.9-100.0) (NPA)
  95% limit of detection: 232.4 RNA cps per ml (95% CI = 163.1-301.6)
  Time to result: 90 minutes
  Principal logistic challenge: Requires refrigeration at 2-8 °C, optimally should be used in a class II safety cabinet or similar, needs special training, automated process and requires a minimum of 100 µl whole blood by venipuncture
  Cost: US$19.80 per cartridge

Test name: LifeRiver
  Detected: Ebola-specific RNA
  Sensitivity: 1 log10 lower limit of detection than RealStar
  Specificity: NR
  95% limit of detection: 23.9 RNA cps per reaction (95% CI = 13.4-405.9)
  Time to result: Results in 2 hours, total processing 4-6 hours
  Principal logistic challenge: Reagents must be kept at -20 °C; needs equipment, including an appropriate PCR machine; needs special training; needs high safety level in the laboratory
  Cost: NR

*Test is also US Food and Drug Administration (FDA) authorized. All sensitivity, specificity and limits of detection are reported for Zaire EBOV unless otherwise stated. Cps, copies; NPA, negative per cent
agreement; NR, not reported; PCR, polymerase chain reaction; PPA, positive per cent agreement; PPE, personal protective equipment.


METHODS 

We examine RDTs using three different metrics, characterizing their impact 
on individual patient outcomes (represented by the expected case fatality ra- 
tio (CFR) for a person suspected of having Ebola who is seeking care), the 
effectiveness with which health-care units reduce transmission (represented 
by the reproduction number for a person with a true Ebola infection seeking 
care), and the overall scale of an epidemic (represented by the total number 
of cases). We evaluate the first two metrics using a model that focuses on the 
impact of different testing strategies implemented in the context of a single 
health-care unit, and the third using a model of the transmission dynamics of 
Ebola in Sierra Leone as a whole. 


Potential impact of RDTs in a health-care unit 

Figure 1 (and Supplementary Information 2) illustrates the dynamics of pa- 
tients within a health-care unit represented in the health-care-unit-level mod- 
el that we developed to examine three diagnostic testing strategies: PCR-only, 
dual testing and RDT-only. Patients are tested within the holding area or areas 
to determine who should be sent to a confirmed ward and who should be dis- 
charged back to the community (with v the daily number of patients seeking 
care and p the proportion who are infected with Ebola). During the time spent 
in holding, patients without Ebola have a daily risk of Ebola infection of βy,
where β is the transmission rate and y is the prevalence of Ebola infection
among other patients in that holding area. A similar risk affects patients mis-
diagnosed with Ebola in the confirmed ward. At baseline we assume β = 0.15
per day. 
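
To illustrate how the daily risk βy accumulates while a patient waits for a test result, the following minimal sketch treats βy as a constant per-day probability of infection. That constancy, the function names, and the example prevalence and waiting times are assumptions made for illustration; only the baseline β = 0.15 per day is taken from the text.

```python
def daily_infection_risk(beta: float, prevalence: float) -> float:
    """Daily risk of infection for an uninfected patient: beta * y."""
    return beta * prevalence


def cumulative_infection_risk(beta: float, prevalence: float, days: int) -> float:
    """Probability of becoming infected during `days` spent in a holding area.

    Assumes the daily risk is a constant, independent per-day probability
    (an illustrative simplification, not the full health-care-unit model).
    """
    daily = daily_infection_risk(beta, prevalence)
    if not 0.0 <= daily <= 1.0:
        raise ValueError("beta * prevalence must be a probability")
    return 1.0 - (1.0 - daily) ** days


if __name__ == "__main__":
    beta = 0.15           # baseline transmission rate per day (from the text)
    prevalence = 0.5      # hypothetical prevalence among patients in holding
    for days in (1, 2, 7):  # hypothetical waiting times for a test result
        print(days, cumulative_infection_risk(beta, prevalence, days))
```

Shortening the wait from several days to a single day reduces the cumulative risk roughly in proportion, which is the mechanism by which faster results would limit nosocomial infection.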

For analytical tractability, we model a single generation of infection: peo- 
ple with suspected Ebola who enter the health-care unit uninfected, but then 
become infected and are assumed to be discharged back to the community 
or sent to the confirmed ward before they become infectious. When bed ca- 
pacity (the total number of beds in the health-care unit) has been exceeded 
by demand, we assume that patients seeking care are turned away and return 
to the community. 

For each testing strategy we use this health-care-unit-level model to eval- 
uate how many infected patients seeking care are sent to the confirmed ward 
and how many are discharged back to the community (owing to bed shortage 
or false-negative diagnosis when using the RDT-only strategy). In addition, 
and again for each testing strategy, we determine how many patients without 
Ebola who seek care were infected during their stay in the hospital (in the
holding and/or confirmed wards). 

To evaluate the impact of a testing strategy from a patient perspective, 
we compare the CFR among people with suspected Ebola seeking care with 
the CFR among those same patients had they not sought care (community 
CFR). Assuming that care reduces the CFR, hospitalization improves the out- 
come for infected patients. However, patients without Ebola within a health- 
care unit are at risk of nosocomial transmission; hence hospitalization is not 
necessarily optimal for a patient with unknown Ebola infection status. We use 
our model to explore these trade-offs and to determine the epidemiological 
contexts in which the different testing strategies are optimal. 

We assume a CFR of 60% among patients infected with Ebola through- 
out™®. Patients without Ebola seeking care in health-care units are likely to 
present with similar symptoms to those with the infection, including high 
fever, vomiting, diarrhoea and haemorrhaging. These are symptoms that are 
typically observed among patients with severe Lassa fever, a disease highly 
prevalent in West Africa. Therefore, we assume a 20% CFR among patients 
without Ebola, which is comparable with the reported CFR for severe Lassa 
fever cases admitted to hospital’. Furthermore, we assume that all patients 
admitted to the confirmed ward (including those without Ebola) benefit from 
hospital care, with r being the relative CFR of a hospitalized patient with Eb- 
ola (r = 1 representing no benefit of hospitalization). All patients sent to the 
confirmed ward are assumed to stay in the health-care unit for an average of 
7 days. 
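
The patient-perspective comparison can be sketched as follows. This is a simplified illustration rather than the health-care-unit model itself. It assumes that the benefit of care (relative CFR r) applies only to patients who arrive infected, and that an uninfected patient who acquires Ebola in the unit, with hypothetical probability q, faces the untreated Ebola CFR. The function names and the parameter q are ours.

```python
CFR_EBOLA = 0.60   # CFR among patients infected with Ebola (from the text)
CFR_OTHER = 0.20   # CFR among patients without Ebola (from the text)


def community_cfr(p: float) -> float:
    """Expected CFR for a suspected case who does not seek care.

    p is the proportion of suspected cases truly infected with Ebola.
    """
    return p * CFR_EBOLA + (1.0 - p) * CFR_OTHER


def care_cfr(p: float, r: float, q: float) -> float:
    """Expected CFR for a suspected case who seeks care (illustrative only).

    r: relative CFR of a hospitalized Ebola patient (r = 1 means no benefit).
    q: hypothetical probability that an uninfected patient is infected in the unit.
    """
    infected_on_arrival = p * r * CFR_EBOLA
    uninfected_on_arrival = (1.0 - p) * (q * CFR_EBOLA + (1.0 - q) * CFR_OTHER)
    return infected_on_arrival + uninfected_on_arrival


if __name__ == "__main__":
    # Hypothetical scenario: half of suspected cases infected, care halves the CFR.
    print(community_cfr(0.5), care_cfr(0.5, r=0.5, q=0.1))
```

Seeking care is favourable in this sketch only when the gain for infected patients, p(1 - r) × 0.6, outweighs the nosocomial penalty for uninfected patients, (1 - p) × q × (0.6 - 0.2). This is why the balance shifts as p falls late in an epidemic.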

The second important role of health-care units in Ebola outbreaks is to 
reduce transmission by isolating cases. To evaluate the impact of testing 
strategies on transmission at the level of a single health-care unit, we use our 
health-care-unit-level model to calculate the reproduction number of Ebo- 
la-infected patients seeking care, R,,, (the average number of secondary in- 
fections generated by these patients). This reproduction number reflects the 
potential impact of health-care units in reducing transmission and depends on 
the testing strategy used. We assess the epidemiological contexts for which 
different testing strategies are optimal. 

In examining both the patient and transmission perspectives, we explore 
likely scenarios at four different stages of the epidemic: in the early growth 
phase, at the peak, after the peak, and in the tail of the outbreak (reflecting 
the situation around June 2014, November 2014, January 2015 and May 2015, 
respectively). Over the course of the epidemic, we assume that the incidence 
of true cases increases, peaks and then declines; the number of patients with- 
out Ebola seeking care initially increases (for example, due to rising awareness 
as case numbers increase), and remains high as the epidemic wanes (mean- 
ing the prevalence of Ebola infection among suspected cases seeking care, p, 
wanes over time); the reproduction number among those not seeking care
(community reproduction number) decreases, reflecting improved community
control of transmission (for example, safer funeral practices); bed capacity
increases and then plateaus once the epidemic starts to decline (see Table 2
for parameter values).

Figure 1 | Three Ebola health-care-unit diagnostic testing and patient triage strategies. Patients seeking care are first admitted to a holding area where they wait to
be either admitted to a confirmed ward or discharged back to the community. We examine three diagnostic testing strategies. a, Polymerase chain reaction (PCR)-only:
patients await their test results in a single holding area. When PCR test results become available, individuals are either sent to a confirmed ward or discharged
back to the community. b, Dual strategy (rapid diagnostic test (RDT) and PCR): based on initial RDT results, patients seeking care are kept in separate high- or low-risk
wards. When PCR test results become available, individuals are either sent to a confirmed ward or discharged back to the community. c, RDT-only: the RDT result
alone determines who is sent to a confirmed ward or discharged back to the community. Patients are either infected (red), uninfected (blue) or exposed within the
holding area (infected, but not yet infectious, blue outline and red centre). The RDT is assumed to have a lower sensitivity and specificity than PCR, therefore RDT
may give incorrect classifications (false positives and false negatives), which can result in nosocomial infections when non-Ebola cases are admitted, or a high risk of
further transmission in the community if true Ebola cases are erroneously discharged.

We implemented the health-care-unit-level model as an Excel spread- 
sheet and Java program (see Supplementary Information). This freely avail- 
able software allows the reader to explore the impact of different model as- 
sumptions and parameter values on model outcomes. 


Potential impact of RDTs at the population level 

To evaluate the potential impact of RDT use on the overall trajectory of an Eb- 
ola epidemic at the population level (compared with the impact at an individu- 
al- or health-care-unit level) we developed a susceptible-exposed-infectious- 
recovered (SEIR)-type transmission model, which was extended to include a 
highly infectious ‘near death’ stage reflecting increased transmission around 
the time of death and at funerals (see Supplementary Information 3.1.1). The 
model incorporates observed delays between key epidemiological events and 
is parameterized to reproduce the basic reproduction number, R0, observed
during the Sierra Leone epidemic'° (model details are given in Supplemen- 
tary Information 3). 
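
As a rough illustration of this model structure, the sketch below integrates a minimal SEIR-type system with an extra, more infectious 'near death' compartment. It is not the authors' model (which also includes hospital wards, testing delays and observed event delays). All rates here are hypothetical placeholders. The transmission rate is back-calculated so that the toy model has R0 = 1.7, and the 16-fold factor is borrowed from the calibrated relative risk reported in the Results.

```python
def simulate_seir_near_death(days, dt=0.1, n=1_000_000, i0=10, r0=1.7,
                             k=16.0, sigma=1 / 9, gamma=1 / 7,
                             delta=1 / 2, cfr=0.6):
    """Euler integration of S -> E -> I -> (near death D) -> removed.

    k is the relative infectiousness of the near-death stage; sigma, gamma
    and delta are hypothetical rates (per day) of leaving E, I and D.
    """
    # For this toy model, R0 = beta * (1/gamma + cfr * k / delta).
    beta = r0 / (1 / gamma + cfr * k / delta)
    s, e, i, d, r = float(n - i0), 0.0, float(i0), 0.0, 0.0
    history = []
    for step in range(int(round(days / dt)) + 1):
        history.append((step * dt, s, e, i, d, r))
        force = beta * (i + k * d) / n      # force of infection
        new_e = force * s * dt              # new infections
        new_i = sigma * e * dt              # end of incubation
        leave_i = gamma * i * dt            # end of ordinary infectious stage
        leave_d = delta * d * dt            # death and burial
        s -= new_e
        e += new_e - new_i
        i += new_i - leave_i
        d += cfr * leave_i - leave_d        # fatal cases pass through D
        r += (1 - cfr) * leave_i + leave_d  # survivors and the deceased
    return history


if __name__ == "__main__":
    t, s, e, i, d, r = simulate_seir_near_death(days=120)[-1]
    print(f"day {t:.0f}: cumulative infections {1_000_000 - s:.0f}")
```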

We investigate the same diagnostic strategies as in the health-care-unit- 
level model (Fig. 1), but allow for full disease progression within each ward and 
in the community, with the flows between community and hospital wards be- 
ing determined by the testing delays and test characteristics. As in the health- 
care-unit-level model, bed capacity and utilization are tracked such that once 
all beds are occupied, patients seeking care cannot be admitted and therefore 
remain in the community. 
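
The bed-capacity rule can be sketched as a small helper. This is a minimal illustration with hypothetical function and variable names, not code from the authors' spreadsheet or Java program.

```python
def admit(arrivals, occupied, capacity):
    """Admit patients until all beds are occupied.

    Returns (admitted, turned_away); patients who are turned away
    remain in the community.
    """
    free_beds = max(capacity - occupied, 0)
    admitted = min(arrivals, free_beds)
    return admitted, arrivals - admitted


if __name__ == "__main__":
    # 60 patients arrive at a 200-bed unit that already holds 180 patients.
    print(admit(arrivals=60, occupied=180, capacity=200))
```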

The model incorporates observed changes in the bed capacity availa- 
ble during the epidemic, making use of data on the number of ETU beds 
opened over time from the start of the outbreak to the end of May 2015. 
We include a constant number of 60 beds throughout to represent informal 
holding units and other health-care facilities that were available before the
start of the epidemic and were used for Ebola patients before the scale-up
of ETU beds. We vary the mean onset-to-hospitalization delay each month
to match reported performance indicators. At the start of the epidemic, the
relative infectiousness of a non-hospitalized person with Ebola during the
late stage of infection close to, and shortly after, death is assumed to be
[…]-fold higher than infectiousness earlier in disease progression, but this
factor is allowed to decrease over time. Higher late-stage infectiousness allows
the model to represent the enhanced risk of transmission to those caring for
dying patients at home, preparing the corpse of a family member or
undertaking funeral rites. We chose this value to yield a similar overall
probability of onward transmission from fatalities as that of survivors despite
the shorter infectious period of fatal cases. Thus, around half of community
transmission from a non-hospitalized fatal case occurs around the time of death.
We assume safe handling of corpses and burials for all deaths in confirmed
wards of health-care units.

Assuming no use of RDTs, transmission parameters are calibrated to
match the time series of observed incidence (confirmed and probable cases)
and bed occupancy in Sierra Leone. The parameters calibrated include R0,
the initial number of infectious cases present at the start of the simulation,
the daily rate of hospitalizations of people without Ebola, the probability of
care seeking for infected patients and the decrease in death-associated excess
transmissibility in the community. These last two parameters were allowed to
change at three time points: 1 October 2014, 1 November 2014 and 1 January
2015. The rate of people without Ebola seeking care is assumed to be
proportional to the cumulative number of Ebola deaths up to that point.

For the baseline scenario in both health-care-unit-level and population-level
models, we assume 100% sensitivity and specificity of PCR testing,
and 92% sensitivity and 85% specificity for the RDT, matching the published
performance of the ReEBOV test. The average delays in obtaining RDT and
PCR results are assumed to be 1 hour and 2 days, respectively.


Table 2 | Illustration of health-care-unit model predictions for the CFR among patients seeking care (relative to the community CFR) and for the reproduction number
of true Ebola patients seeking care for three testing strategies for levels of bed demand and true Ebola infection prevalence appropriate for four different stages of
the current epidemic, as informed by the calibrated parameters of the population-level transmission model.

Stage of epidemic                                    Early growth                  During the peak               Decreasing                    Going to zero
Test name                                            PCR-only  Dual      RDT-only  PCR-only  Dual      RDT-only  PCR-only  Dual      RDT-only  PCR-only  Dual      RDT-only
V (daily rate of arrival of patients)‡               3)                            60                            60                            60
p (%, prevalence)‡                                   90                            70                            50                            10
Community reproduction number‡                       1.7                           ths                           0.85                          0.85
CFR among patients seeking care*                     0.73      0.72§     0.74      0.79      0.76§     0.78      0.85      0.81§     0.84      0.98      0.96§     1.00
Reproduction number of Ebola patients seeking care*  0.48      0.47§     0.56      0.53      0.49§     0.59      0.36      0.28§     0.37      0.49      0.4§      0.75
Bed demand                                           33        33        30        330       330       290       270       270       226       150       150       97
Bed capacity†,‡                                      30                            200                           350                           350
CFR among patients seeking care†                     0.75      0.74      0.74§     0.87      0.86      0.85§     0.85      0.81§     0.84      0.98      0.96§     1.00
Reproduction number of Ebola patients seeking care†  0.57      0.56§     0.56      0.99      0.96      0.94§     0.36      0.28§     0.37      0.49      0.4§      0.75

*Results assuming bed capacity (total number of beds in health-care unit) exceeds demand. †Results assuming health-care-unit bed numbers are limited. For the CFR results, we assume hospitalization
decreases the CFR of patients admitted to the confirmed ward by a factor of 0.7. ‡Assumed model parameters. §The optimal strategies. Further model parameters are fixed at their baseline values
(Supplementary Tables 4 and 6 show equivalent results with lower and higher nosocomial transmission rate). CFR, case fatality ratio; PCR, polymerase chain reaction; RDT, rapid diagnostic test.


We also examine the impact of reducing the delay in obtaining PCR test 
results from 2 days to 1 day, the minimum delay realistically achievable when 
laboratory facilities are located close to health-care units. Although current 
RDTs have limited sensitivity and specificity, the next generation of diagnos- 
tic tests may have substantially improved characteristics. We therefore also 
investigate the impact of using near-perfect RDTs, assuming sensitivity and 
specificity of 99% each. 

Given the challenges faced during the epidemic, such as sample collection,
storage, transport and laboratory processing, it is unlikely that the high
nominal sensitivity and specificity of PCR testing were achieved. Moreover,
considerable variation in the sensitivity and specificity of the ReEBOV RDT has
been reported in the literature. As a sensitivity analysis, we repeat the
simulations using arguably more realistic assumptions regarding test sensitivity
and specificity. We assume PCR test sensitivity and specificity of 85% and
95%, respectively, and RDT sensitivity and specificity of 82% and 80%,
respectively, and recalibrate the remaining model parameters. The larger
relative decreases in sensitivity and specificity of the PCR test compared with
the RDT reflect the greater logistical constraints associated with the former.


RESULTS 

After calibrating the population-level transmission model, we estimate that 
37% of people with Ebola sought health care at the start of the epidemic, in- 
creasing to 41%, 62% and 73% at the beginning of October 2014, November 
2014 and January 2015, respectively. The relative risk of transmission associ- 
ated with death is high up to the peak of the epidemic (16 until 30 September, 
and 15 until 31 October), and low after the epidemic peak (1.2 between 1 No- 
vember and 31 December, and 1.1 from January onwards). The rate of patients 
without Ebola seeking care equates to around 120 patients per day at the end 
of the epidemic. The calibrated value of R0 is 1.7 (see Supplementary Table 11
and Supplementary Information 3.4 for sensitivity analyses with regard to the
calibrated model parameters).


Potential impact of RDTs in a health-care unit 

From a patient perspective, hospitalization is not necessarily optimal and a 
trade-off exists between the risk of nosocomial infection and the benefits of 
treatment. At different stages of the epidemic and for the three testing strat- 
egies, Figure 2 shows the average CFR per person with suspected Ebola who 
seeks care as a function of the CFR reduction in the confirmed ward and of the 
ratio of bed capacity to patients seeking care per day. For hospitalization to be 
beneficial, the CFR among patients seeking care must be sufficiently reduced 
by treatment to compensate for the risk of nosocomial transmission to pa- 
tients without Ebola (Fig. 2). When most patients seeking care are true Ebola 
cases, a small benefit of treatment is sufficient to reduce the average CFR per 
patient: at the peak (p = 0.7) using the dual strategy, CFR with treatment must
fall below 97% of its community value (r = 0.97) to decrease the average CFR
among patients seeking care (Table 2 and Fig. 2e). When most patients seeking
care do not have Ebola, a higher benefit of treatment is required to reduce
the average CFR per patient: at the tail of the outbreak (p = 0.1) using the dual
strategy, CFR with treatment must fall below 86% of its community value
(r = 0.86) to decrease the average CFR among patients seeking care (Fig. 2k).

We first consider the impact of testing strategies when the demand for beds 
does not exceed capacity (Fig. 2). When hospitalization has little impact on 
CFR, the dual strategy is preferred as it reduces nosocomial transmission. 
However, RDT-only may become preferable (for example, at the tail of an 
epidemic; Fig. 2j-l) if the health-care-unit CFR is sufficiently lower than the 
community CFR such that the higher level of nosocomial transmission in the 
holding wards seen for the dual strategy outweighs the impact of increased 
nosocomial transmission in the confirmed wards seen for RDT-only, given that 
the patients in the latter will benefit from care. 

When demand for beds exceeds capacity (Fig. 2), the same reasoning ap- 
plies and RDT-only is increasingly favoured as it allows a larger proportion of 
cases to be admitted compared with the other strategies. 

For realistic parameter values, we estimate that the RDT-only strategy 
would be marginally optimal during the growing phase of an epidemic, where- 
as the dual-testing strategy would be marginally optimal in the later stage 
(Fig. 2, Table 2, Supplementary Table 3). For instance, during the peak of the 
epidemic, we assume 60 patients seek care daily and 200 beds are availa- 
ble (resulting in 3.3 beds per patient seeking care), a level of capacity that 
results in 40% of patients being turned away from the health-care units for 
the PCR-only (and dual) strategy, compared with 32% of patients turned away
using RDT-only. Furthermore, if we assume a 30% decrease in CFR for those
patients sent to the confirmed ward (r = 0.7), the CFR among people with
suspected Ebola seeking care (relative to community CFR) would be reduced by
15% under the RDT-only strategy and by 14% under the dual strategy (Table 2, 
Fig. 2, Supplementary Table 4 for further sensitivity analyses). 
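
The capacity figures quoted in this paragraph can be checked approximately with back-of-the-envelope arithmetic. The sketch below assumes a steady state in which the share of patients turned away is the shortfall of beds relative to demand, and it uses the peak-stage bed demand values as read from Table 2 (330 for PCR-only and dual, 290 for RDT-only). Because those table values are rounded, the results only approximate the 40% and 32% given in the text.

```python
def fraction_turned_away(bed_demand, bed_capacity):
    """Steady-state share of patients who cannot be admitted."""
    if bed_demand <= bed_capacity:
        return 0.0
    return 1.0 - bed_capacity / bed_demand


beds_per_daily_arrival = 200 / 60              # about 3.3 beds per patient per day
pcr_or_dual = fraction_turned_away(330, 200)   # about 0.39
rdt_only = fraction_turned_away(290, 200)      # about 0.31

if __name__ == "__main__":
    print(round(beds_per_daily_arrival, 1), round(pcr_or_dual, 2), round(rdt_only, 2))
```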

Figure 3 shows the impact of introducing RDTs on the transmission by a 
person with Ebola who seeks care for a scenario that is comparable with that 
in many affected areas at the peak of the recent West African epidemic (see 
Supplementary Figs 1-4 for results at other stages of the epidemic). We eval- 
uate the dual testing (Fig. 3a) and RDT-only (Fig. 3b) strategies relative to the 
impact of PCR-only testing. 

When bed capacity exceeds demand, the results illustrate some key con- 
clusions about the impact of RDTs on Ebola transmission: transmission is al- 
ways lowest for the dual strategy, with a reduction in the reproduction number 
of up to 14% compared with PCR-only (Fig. 3a, Table 2); depending on the 
RDT's sensitivity and specificity, the RDT-only strategy may result in lower
transmission than the PCR-only strategy (Fig. 3b, Table 2). Low RDT sensitivity
worsens the predicted outcome of the RDT-only strategy much more than the
outcome of the dual strategy (Fig. 3a,b).

Figure 2 | Case fatality ratio (CFR) of patients seeking care divided by the CFR
if those same Ebola and non-Ebola cases had remained in the community. Each
row represents a particular stage in the epidemic, from top to bottom: early
(a,b,c), during the peak (d,e,f), shortly after the peak (g,h,i) and once the
epidemic is tailing off (j,k,l). Each column reflects a testing strategy, namely
polymerase chain reaction (PCR)-only (a,d,g,j), dual strategy (b,e,h,k) and rapid
diagnostic test (RDT)-only (c,f,i,l). White horizontal lines show the threshold
bed capacity below which demand cannot be met for PCR-only (same threshold as
dual strategy) and RDT-only. Solid grey and black lines (left panels, a,d,g,j)
indicate, respectively, where the outcomes of PCR-only and RDT-only are
equivalent, and where the outcomes of dual (RDT and PCR) testing and RDT-only
are equivalent. Those lines delimit parameter space where (1) dual strategy is
best followed by PCR-only and then RDT-only, (2) dual strategy is best followed
by RDT-only and then PCR-only and (3) RDT-only is best followed by dual strategy
and then PCR-only. On the left of the white solid vertical line (specific for
the testing strategy), the benefit of care is sufficient to decrease the average
CFR among patients seeking care (unaware of their disease status, and assuming
hospital infection control has not improved over the course of the epidemic).
The black arrows on the right y-axis of the RDT-only plots indicate the likely
availability of beds at the corresponding stage of the epidemic (Table 2);
however, this is likely to have varied between different health-care units.

When bed capacity cannot meet demand, the RDT-only strategy may give 
lower transmission than both the PCR-only and the dual strategies (Figs 3c, 
d, Table 2). Relying on RDTs alone is optimal during the peak of the epidemic 
owing to the strategy’s better use of bed capacity (Table 2, Fig. 3c,d, Supple- 
mentary Information 2.3, Supplementary Figs 2-5). The benefit of the RDT- 
alone strategy increases as the level of infection control in the health-care unit 
decreases (for example see Supplementary Table 6 for results with the rate of 
nosocomial transmission set at 0.1 and 0.2 per day). Although the reduction in 
the reproduction number may seem modest during the growing phase of the 
epidemic (under 11%; Table 2, Supplementary Fig. 1), a reduction of this size 
can have a considerable impact on cumulative case numbers. 
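
To see why a modest change in a reproduction number can matter, consider a simple generation-by-generation count. This is only an illustration: it applies the proportional reduction to a single overall reproduction number, whereas the figure above refers specifically to patients seeking care, and it ignores depletion of susceptibles and changes in behaviour over time.

```python
def cumulative_cases(r, generations, seed=1.0):
    """Total cases after a number of generations of spread, starting from `seed`."""
    total = 0.0
    current = seed
    for _ in range(generations + 1):
        total += current
        current *= r
    return total


if __name__ == "__main__":
    baseline = cumulative_cases(1.7, generations=10)
    reduced = cumulative_cases(1.7 * 0.89, generations=10)  # 11% lower R
    print(f"relative epidemic size after 10 generations: {reduced / baseline:.2f}")
```

Because the difference compounds at every generation, the gap in cumulative cases is much larger than the 11% difference in the reproduction number itself.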
Figure 3 | Relative reproduction number of patients with Ebola who are
seeking care for dual strategy and rapid diagnostic test (RDT)-only, compared
to polymerase chain reaction (PCR)-only, at the peak of the epidemic (see
Table 2 for parameters). For a,b bed capacity is unlimited, whereas for c,d the
health-care unit has 200 beds (Table 2). The outcome using a dual (RDT and PCR)
strategy is shown in a,c, whereas b,d present the outcome using RDT-only. For
the specified parameters and when bed capacity is unlimited, the reproduction
number for the PCR-only strategy is 0.53 (0.99 when bed capacity is limited
to 200). The PCR-only outcome is independent of the RDT’s sensitivity and
specificity. Solid grey and black lines indicate, respectively, where the outcomes
of PCR-only and RDT-only are equivalent, and where the outcomes of dual (RDT
and PCR) testing and RDT-only are equivalent. Those lines delimit parameter
space where (1) dual strategy is best followed by PCR-only and then RDT-only, (2)
dual strategy is best followed by RDT-only and then PCR-only and (3) RDT-only is
best followed by dual strategy and then PCR-only. The black circle indicates the
World Health Organization reported sensitivity and specificity of the ReEBOV
RDT (92% and 85%, respectively).


Potential impact of RDTs at the population level

We achieve a good match between observed incidence in Sierra Leone and 
the population-level transmission model after calibration, assuming PCR-only 
testing (Fig. 4a). We then use the model to evaluate the potential impact of 
RDTs (Fig. 4, Table 3). We estimate that adopting an RDT-only strategy would 
have reduced the size of the epidemic by 6%, despite imperfect sensitivity 
leading some patients with Ebola to be discharged back to the community 
and continuing to transmit infection there, particularly in the tail of the out- 
break. The number of patients with Ebola discharged from hospital is similar 
under both the PCR-only and RDT-only strategies; for the PCR-only strategy, 
these are nosocomial infections, whereas for the RDT-only strategy they are 
predominantly cases with false-negative test results. When demand for beds 
exceeds capacity, the quick turnaround time of the RDT considerably reduces 
the need for hospital beds, and consequently fewer patients are turned away 
from hospital (Fig. 4b and Supplementary Fig. 8). Thus, RDT-only might be the 
preferred testing strategy, particularly when bed demand exceeds capacity.

However, with the limited sensitivity and specificity of existing RDTs, 
adopting a dual strategy would have been more effective than relying on ei- 
ther PCR- or RDT-testing alone. We estimate that such a dual strategy could 
have reduced the size of the epidemic by 32% compared with the PCR-only 
strategy (Fig. 4, Table 3). Note that the precise reduction in case numbers 
achieved crucially depends on the risk of nosocomial transmission. The seg- 
regation of cases into high- and low-risk wards sufficiently reduces nosoco- 
mial transmission to substantially reduce the number of newly infected cases 
being discharged back to the community. This more efficient disruption of 
transmission within both health-care-unit and community settings leads to 
lower incidence early in the epidemic, and consequently a lower bed demand
throughout, reducing the time periods when bed demand exceeds capacity
(Fig. 4b, Supplementary Fig. 9).

Table 3 | Summary statistics of the different testing scenarios considered. The upper rows assume test performance as in the baseline scenario, whereas the lower
rows assume real-world use leads to a reduction in test performance as described in the methods.

Scenario                  Total number of cases   Peak weekly   Total number of      Total number of Ebola cases
                          (% relative to          incidence     infected patients    turned away from health-care
                          PCR-only)                             discharged           units due to lack of beds

Perfect PCR and nominal RDT characteristics
  PCR-only                11,600 (100)            620           780                  490
  RDT-only                10,800 (94)             580           960                  60
  Dual testing            7,900 (68)              460           280                  40
  Fast PCR                8,100 (70)              480           320                  0
  Near perfect RDT-only   6,700 (58)              390           70                   0

Reduced test performance in real-world use
  PCR-only                11,500 (100)            630           1,820                710
  RDT-only                10,700 (93)             600           1,760                260
  Dual testing            9,100 (79)              520           1,270                290
  Fast PCR                8,800 (77)              540           1,260                70

PCR, polymerase chain reaction; RDT, rapid diagnostic test.

Reducing the average delay in obtaining PCR test results from 2 days to 
1 day could have reduced overall case numbers by 30% without recourse to 
RDTs (Fig. 4, Table 3). The impact of faster PCR testing is twofold: a reduction 
in the average stay in health-care units reduces demand for beds in the holding 
wards, and reduces nosocomial transmission. 
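
The link between waiting time and nosocomial risk can be illustrated with a constant-hazard approximation. This is a sketch under simplifying assumptions, not the paper's ward-level dynamics: it treats the nosocomial transmission rate as a fixed per-day hazard for an uninfected patient (the values 0.1 and 0.2 per day are the rates mentioned for Supplementary Table 6) and ignores how the hazard depends on the mix of patients in the holding area.

```python
import math


def prob_infected_while_waiting(rate_per_day, delay_days):
    """Probability of infection during the wait, for a constant daily hazard."""
    return 1.0 - math.exp(-rate_per_day * delay_days)


if __name__ == "__main__":
    delays = {"PCR, 2 days": 2.0, "PCR, 1 day": 1.0, "RDT, 1 hour": 1.0 / 24.0}
    for rate in (0.1, 0.2):
        for label, delay in delays.items():
            risk = prob_infected_while_waiting(rate, delay)
            print(f"rate {rate}/day, {label}: {risk:.3f}")
```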

If near-perfect RDTs were available (with both sensitivity and specificity 
at 99%), we estimate that the epidemic in Sierra Leone could have been re- 
duced by 42% — substantially better than even 1-day-turnaround PCR testing 
could achieve. In this scenario, the duration of health-care-unit stay for peo- 
ple who are suspected of having, but are not infected with, Ebola is minimal, 
implying a considerably lower demand for beds (Fig. 4b). The near-perfect 
RDT generates a very small number of misdiagnoses, therefore reducing both 
the numbers of false-negative discharges and false-positive admissions to the 
confirmed ward — saving available beds for patients truly infected with Ebola 
(Supplementary Fig. 9). 

Overall, model predictions of the epidemic size are highly sensitive to the 
assumed values of RDT sensitivity and specificity for the RDT-only strategy, 
ranging from a 42% reduction for a near perfect test to nearly a twofold in- 
crease if sensitivity and specificity are very poor. However, model predictions 
for the dual strategy are much less sensitive to the RDT characteristics, with 
the reduction in epidemic size only varying in the range 27% to 41% (see Sup- 
plementary Information 3.4). 

Sensitivity analyses that assume lower PCR and RDT sensitivity and spec- 
ificity (owing to challenges in sample-taking and transport) give qualitatively 
similar results (Table 3). 
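
The percentage reductions quoted in this section follow from the case totals in Table 3. The short calculation below recomputes each scenario's epidemic size relative to PCR-only. Because the totals in the table are rounded to the nearest hundred, a recomputed percentage can differ from the published one by about one point (for example, baseline RDT-only comes out near 93% rather than 94%).

```python
baseline_totals = {
    "PCR-only": 11_600,
    "RDT-only": 10_800,
    "Dual testing": 7_900,
    "Fast PCR": 8_100,
    "Near perfect RDT-only": 6_700,
}


def relative_size(totals, reference="PCR-only"):
    """Epidemic size of each scenario as a percentage of the reference scenario."""
    ref = totals[reference]
    return {name: 100.0 * cases / ref for name, cases in totals.items()}


if __name__ == "__main__":
    for name, pct in relative_size(baseline_totals).items():
        print(f"{name}: {pct:.0f}% of PCR-only, reduction {100 - pct:.0f}%")
```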


DISCUSSION 

Our results support the WHO advice on the use of RDTs, showing that RDTs 
could improve the control of Ebola epidemics, particularly in contexts in which 
laboratory or bed capacity is limited compared with demand, and infection 
control between patients in health-care units is imperfect. Implementing a dual 
testing strategy (whereby RDTs are used for early triage into high- and low-risk 
holding areas, followed by confirmation by PCR before admission to confirmed 
wards or patients are discharged) has the potential to decrease nosocomial 
transmission, leading to a sizeable reduction in the final epidemic size. 

Crucial to the assessment of the benefits of RDTs in Ebola control is the 
ever-present risk of nosocomial transmission. Even when health-care-unit bed 
capacity can meet demand, a trade-off exists between using a slow PCR test 
and relying solely on RDTs. The longer wait for PCR test results increases the 
risk that suspected patients who are not actually infected with Ebola will be- 
come infected in holding areas. However, relying on RDTs alone has two poten- 
tial negative consequences: false positives due to imperfect specificity, which 
lead to increased nosocomial transmission as non-Ebola patients are sent to
the confirmed ward; and false negatives due to imperfect sensitivity, leading to 
discharge of infected individuals back to the community. When most people 
with suspected Ebola seeking care are infected (as might be expected in the 
growth phase of an epidemic), high sensitivity is the more important of these 
two factors to reduce the risks of discharging false negatives. When most peo- 
ple with suspected Ebola are not infected with Ebola (as expected in the tail 
of an epidemic) high specificity is needed to minimize the number of false 
positives. As nosocomial transmission risks increase (for example, due to high 
demand on services, overcrowding or limitations of infection control), so do 
the benefits of adopting RDTs in combination with PCR testing. 
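
The shift in importance from sensitivity to specificity as prevalence falls can be made concrete by counting expected misclassifications. The sketch below uses the baseline RDT characteristics (92% sensitivity, 85% specificity) and the four prevalence values listed in Table 2; expressing the counts per 100 patients tested is a choice made here for illustration.

```python
def rdt_misclassified_per_100(prevalence, sensitivity=0.92, specificity=0.85):
    """Expected false negatives and false positives per 100 patients tested."""
    false_negatives = 100 * prevalence * (1 - sensitivity)
    false_positives = 100 * (1 - prevalence) * (1 - specificity)
    return false_negatives, false_positives


if __name__ == "__main__":
    for p in (0.9, 0.7, 0.5, 0.1):
        fn, fp = rdt_misclassified_per_100(p)
        print(f"prevalence {p:.0%}: {fn:.1f} false negatives, {fp:.1f} false positives")
```

At 90% prevalence false negatives dominate, whereas at 10% prevalence false positives dominate, matching the argument above.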

The prevalence of Ebola infection in patients seeking care and the risk of 
nosocomial transmission crucially determine the optimal testing strategy. In- 
termediate values of the underlying prevalence of Ebola infection among those 
seeking care give the highest risk of nosocomial transmission by maximizing 
contact between patients with and without Ebola. 

Although the average bed occupancy per suspect case is the same for the 
PCR-only and dual strategies, the RDT-only strategy affects bed usage: less 
time is spent awaiting test results whereas the proportion of patients sent to 
the confirmed ward is determined by RDT sensitivity and specificity. Here, the 
combination of high sensitivity and specificity is of course optimal — a poor 
specificity of RDT may increase overall bed occupancy by causing increased 
numbers of patients without Ebola to be falsely confirmed with Ebola and ad- 
mitted to the confirmed wards. 

During the current epidemic there were periods in which bed demand 
exceeded capacity. In such circumstances, our analysis suggests that relying 
on RDTs alone could substantially improve utilization of bed capacity, despite 
less than perfect diagnostic sensitivity and specificity. We estimate that using 
the RDT-only strategy throughout the epidemic would have decreased case 
numbers by a modest 6% in Sierra Leone. This is because the better bed utili- 
zation under this strategy is counterweighed by the higher false-positive rate 
of current RDTs compared with PCR; this becomes more of an issue in the tail 
of an epidemic during which we expect the prevalence of true Ebola infection 
among people with suspected Ebola to decline. However, the RDT-only strat- 
egy would be a more attractive option if infection control in health-care units 
could be increased in the tail of an epidemic, mitigating this negative effect. 

The dual strategy using both PCR and RDT was predicted to have the 
greatest impact on epidemic size, potentially reducing the size of the epidemic 
in Sierra Leone by a third. A similar reduction could have been achieved with 
PCR testing alone if results had been consistently reported back to health- 
care units within 24 hours. However, as the logistical challenges involved in 
setting up rapid, high-throughput PCR testing close to the point of care are 
substantial, RDTs are likely to be a useful addition to the diagnostic armoury 
for combatting future Ebola epidemics. 
Figure 4 | Ebola outbreak in Sierra Leone. a, Observed (grey bars) and
expected (coloured lines) weekly incidence of confirmed and probable Ebola
cases during the outbreak in Sierra Leone. The red line presents the expected
incidence using the polymerase chain reaction (PCR)-only strategy on which
the model was calibrated. Other lines present the estimated incidence
under counterfactual scenarios: rapid diagnostic test (RDT)-only (blue), dual
strategy (green), PCR-only with faster test results delivered within 1 day
(purple) or RDT-only with near-perfect sensitivity and specificity of 99% each
(orange). b, Bed capacity (grey) and usage (colours) throughout the epidemic
in Sierra Leone for the same scenarios as in (a).


Our models inevitably simplify the complex processes underlying Ebola 
transmission dynamics and patient testing and treatment. However, they pro- 
vide an important initial evaluation of the potential impact of RDT use on out- 
break control. Our qualitative conclusions were robust to sensitivity analyses, 
and although the numerical predictions of reductions in epidemic size should 
be cautiously interpreted, the benefit of using RDTs in combination with PCR 
testing remained. In the baseline population-level analysis, we assumed
perfect sensitivity and specificity of PCR testing, but logistical challenges in field
conditions might lead to substantially lower real-world performance.
However, we found that the predicted qualitative impact of RDTs was robust to
possible suboptimal sensitivity and specificity of PCR tests (Table 3).

Furthermore, there is considerable uncertainty in RDT performance, with
reported sensitivity and specificity for ReEBOV varying between 78% and
100%, and 85% and 92%, respectively. Such differences would have
substantial epidemiological and clinical consequences should an RDT-only
testing strategy be adopted. This uncertainty also limits our ability to assess the
potential impact of RDTs, particularly if used alone. The predicted impact of
dual-testing strategies is less sensitive to uncertainty in RDT performance,
with all reported values for ReEBOV resulting in a considerable reduction of
the epidemic size (between 27% and 41%).

Finally, for the main analysis presented here, we conservatively assumed
that RDTs would be less accurate than PCR tests. Although this is true for
ReEBOV (Supplementary Table 1), some alternative RDTs in development
seem to be as accurate as their PCR counterparts, even when field tested (for
example, REVAMP (ref. 23); J.-C. Manuguerra, personal communication;
Supplementary Table 1). Such tests could have a dramatic impact on the control of
future epidemics in which health care or laboratory capacity is limited (Table
3). Therefore, as innovation in RDT development continues, RDTs could quickly
replace PCR testing, achieving similar accuracy with the benefit of faster
results, simpler logistics and safer handling of isolates. This article has evaluated
a limited set of diagnostic testing strategies for Ebola and further work should 
evaluate alternative diagnostic methods tailored to field procedures. In Sierra 
Leone, the epidemic response was decentralized with each district hosting an 
Ebola response centre, which investigated local cases based on contact trac- 
ing, potential exposure and local knowledge. In such settings, an alternative 
testing strategy could see RDTs used in the community. This could reduce bed 
demand, costs, patient-transportation needs and risks of transmission. Fur- 
thermore, such community use of RDTs could potentially identify cases that 
would not have met clinical case definitions, and would send a positive mes- 
sage to the community that the testing is fast and transparent. 

In the recent West African Ebola epidemic it has been difficult to obtain 
information about the changes over time in funeral practices, care-seeking 
behaviour, bed capacity and population mobility. In addition, the level of case 
ascertainment is uncertain, but is likely to have been poor at certain times 
in some areas²⁴,²⁵. These data gaps may be at least partly filled if priority is 
placed on collation of the wealth of local knowledge and data that may oth- 
erwise become lost as the people who contributed to the response resume 
other activities. 

Updating standard diagnostic and triage procedures for Ebola to include 
the use of RDTs could offer substantial benefits both from a patient and 
health-care-unit perspective, and in improving overall control of an epidemic. 
Until now, Ebola outbreaks were thought to be easily controlled. The usual 
narratives surrounding the control of Ebola highlight the importance of safe 
funerals, prompt isolation and effective contact tracing. However, testing 
strategies can also have a crucial role in minimizing the opportunity for noso- 
comial transmission and maximizing bed utilization. The ease of use of RDTs, 
particularly in a resource-poor setting, makes them potentially powerful tools 
for rapid detection and containment of future outbreaks. 


1. World Health Organization. Laboratory Diagnosis of Ebola Virus Disease (WHO, 2014). 

2. World Health Organization. Urgently Needed: Rapid, Sensitive, Safe and Simple Ebola 
Diagnostic Tests 
http://www.who.int/mediacentre/news/ebola/18-november-2014-diagnostics/en/ (2014). 

3. Altona Diagnostics. RealStar® Filovirus Screen RT-PCR Kit 1.0. (Altona, 2014). 

4. Obelis, S. A. & Shanghai, Z. J. Bio-Tech Co. Ltd. LifeRiver Ebola Virus (EBOV) Real Time 
RT-PCR Kit User Manual (Bio-Tech, 2012). 

5. Saijo, M. et al. Laboratory diagnostic systems for Ebola and Marburg hemorrhagic fevers 
developed with recombinant proteins. Clinical Vaccine Immunol. 13, 444-451 (2006). 

6. Towner, J. S. et al. Rapid diagnosis of Ebola hemorrhagic fever by reverse transcrip- 
tion-PCR in an outbreak setting and assessment of patient viral load as a predictor of 
outcome. J. Virology 78, 4330-4341 (2004). 

7. Strecker, T. et al. Field evaluation of capillary blood samples as a collection specimen for 
the rapid diagnosis of Ebola virus infection during an outbreak emergency. Clinical Infect. 
Dis. 61, 669-675 (2015). 

8. Chua, A. C., Cunningham, J., Moussy, F., Perkins, M. D. & Formenty, P. The case for improved 
diagnostic tools to control Ebola virus disease in West Africa and how to get there. PLoS 
Negl. Trop. Dis. 9, e0003734 (2015). 

9. Zachariah, R. & Harries, A. D. The WHO clinical case definition for suspected cases of 
Ebola virus disease arriving at Ebola holding units: reason to worry? Lancet Infect. Dis. 
15, 989-990 (2015). 

10. Pathmanathan, I. et al. Rapid assessment of Ebola infection prevention and control needs 
— six districts, Sierra Leone, October 2014. Morb. Mortal. Wkly Rep. 63, 1172-1174 (2014). 

11. World Health Organization. Ebola Situation Report – 26 November 2014 
http://apps.who.int/ebola/en/ebola-situation-report/situation-reports/ebola-situation-report-26-november-2014 
(WHO, 2014). 

12. World Health Organization. Ebola Situation Report – 21 January 2015 
http://apps.who.int/ebola/en/status-outbreak/situation-reports/ebola-situation-report-21-january-2015 
(WHO, 2015). 

13. World Health Organization. WHO Emergency Quality Assessment Mechanism for EVD 
IVDs. Product: RealStar® Filovirus Screen RT-PCR Kit 1.0 (WHO, 2015). 

14. Corgenix. ReEBOV Antigen Rapid Test (Ebolavirus VP40 Antigen Detection) Instructions for 
Use (Corgenix, 2015). 

15. World Health Organization. WHO Emergency Use Assessment and Listing for Ebola Virus 
Disease IVDs. Product: ReEBOV Antigen Rapid Test Kit (WHO, 2015). 

16. Broadhurst, M. J. et al. ReEBOV Antigen Rapid Test kit for point-of-care and 
laboratory-based testing for Ebola virus disease: a field validation study. Lancet 386, 
867-874 (2015). 




DIAGNOSTICS AND EBOLA | NOUVELLET ET AL. 


17. Food and Drug Administration. 2014 Ebola Virus Emergency Use Authorizations 
http://www.fda.gov/medicaldevices/safety/emergencysituations/ucm161496.htm#ebola 
(FDA, 2014-2015). 

18. WHO Ebola Response Team. West African Ebola epidemic after one year – slowing 
but not yet under control. N. Engl. J. Med. 372, 584-587 (2015). 

19. National Center for Emerging and Zoonotic Infectious Diseases CDC. Lassa Fever 
http://www.cdc.gov/vhf/lassa/pdf/factsheet.pdf (CDC, 2015). 

20. WHO Ebola Response Team. Ebola virus disease in West Africa – the first 9 months of 
the epidemic and forward projections. N. Engl. J. Med. 371, 1481-1495 (2014). 

21. World Health Organization. Ebola Situation Report – 27 May 2015 
http://apps.who.int/ebola/en/current-situation/ebola-situation-report-27-may-2015 
(WHO, 2015). 

22. Bhadelia, N. Rapid diagnostics for Ebola in emergency settings. Lancet 386, 833-835 
(2015). 

23. Manuguerra, J.-C. Molecular pathogen detection in the field, directly at the point of 
care. Proc. 3rd Int. Congress on Targeting Infectious Diseases (Task Force Infectious 
Disease, 2015). 

24. Meltzer, M. I. et al. Estimating the future number of cases in the Ebola epidemic – 
Liberia and Sierra Leone, 2014-2015. Morb. Mortal. Wkly. Rep. 63, 1-14 (2014). 

25. World Health Organization. Ebola Situation Report – 14 January 2015 
http://apps.who.int/ebola/en/status-outbreak/situation-reports/ebola-situation-report-14-january-2015 
(2015). 

26. Cepheid. Cepheid Xpert Ebola Assay (Cepheid, 2015). 

27. World Health Organization. WHO Emergency Use Assessment and Listing for EVD IVDs 
PUBLIC REPORT Product: Xpert® Ebola Assay (WHO, 2015). 




28. World Health Organization. WHO Emergency Use Assessment and Listing Procedure for 
EVD IVDs. PUBLIC REPORT. Product: Liferiver™ — Ebola Virus (EBOV) Real Time RT-PCR 
Kit (WHO, 2015). 


SUPPLEMENTARY MATERIAL 
Is linked to the online version of this paper at: http://dx.doi.org/10.1038/nature16041 


ACKNOWLEDGEMENTS 

We thank J.-C. Manuguerra for sharing preliminary results on the effectiveness of the 
REVAMP RDT and L. Simonsen for early discussions. We furthermore acknowledge research 
funding from the Medical Research Council, the Bill and Melinda Gates Foundation, the 
MIDAS network of the National Institute of General Medical Sciences (National Insti- 
tutes of Health), the Health Protection Research Units of the National Institute for Health 
Research, the European Union Seventh Framework Programme [FP7/2007–2013] under 
Grant Agreement no 278433-PREDEMICS, and the Wellcome Trust. 


COMPETING FINANCIAL INTERESTS 
The authors declare no competing financial interests. Financial support for this publica- 
tion has been provided by the Bill & Melinda Gates Foundation. 


ADDITIONAL INFORMATION 
This work is licensed under the Creative Commons Attribution 4.0 Inter- 
national License. The images or other third party material in this article 
are included in the article’s Creative Commons license, unless indicated 
otherwise in the credit line; if the material is not included under the Creative Commons 
license, users will need to obtain permission from the license holder to reproduce the mate- 
rial. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0 


3 December 2015 | 7580 | 528 


